Design and manage data for sharing

To encourage and drive data sharing, it is important to have data which is well designed and facilitates use and reuse.

Based on the Digital Design Standard, the following are elements of good data design that all agencies need to consider when data sharing:

Create with purpose

  • When collecting data, make sure that unnecessary personal information is not captured. Only ever capture personal information if it is directly needed for customer service and if the customer has given consent.
  • Make sure all your datasets have an owner who is responsible for data management and who can make decisions about who can access, use and share them.
  • Help people to know the lineage, context, background and reporting associated with your data so that they can understand and trust the data, and know how to use it in their work.
  • Look for opportunities to share current or real time data. Publishing live, real-time feeds can be incredibly valuable for some types of data. Make sure you use dates or timestamps to help users identify the age and relevance of your data.
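
As an illustration of the timestamp advice above, the following is a minimal sketch, not a prescribed method. It assumes records are simple dictionaries; the field name `observed_at` and both helper functions are hypothetical. It attaches an ISO 8601 timestamp in UTC to each record so that users can work out how old the data is.

```python
from datetime import datetime, timezone


def with_timestamp(record, observed_at):
    """Return a copy of the record with an ISO 8601 UTC timestamp attached.

    The field name "observed_at" is an assumption made for this sketch.
    """
    if observed_at.tzinfo is None:
        raise ValueError("observed_at must be timezone-aware")
    stamped = dict(record)
    stamped["observed_at"] = observed_at.astimezone(timezone.utc).isoformat()
    return stamped


def age_in_hours(stamped, now):
    """Return how many hours old a stamped record is at the time `now`."""
    observed = datetime.fromisoformat(stamped["observed_at"])
    return (now - observed).total_seconds() / 3600


# Hypothetical reading from a real-time feed.
reading = with_timestamp(
    {"station": "Example Station", "value": 12.5},
    datetime(2019, 6, 19, 9, 30, tzinfo=timezone.utc),
)
hours_old = age_in_hours(reading, datetime(2019, 6, 19, 12, 30, tzinfo=timezone.utc))
```

Storing the timestamp with an explicit UTC offset avoids ambiguity when data is combined across agencies or time zones.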

Prioritise high value data for sharing and release

All agencies should prioritise the sharing of data that:

  • Describes the performance or use of government services – this can help other agencies with planning and service provision. This data also informs the public and helps them to make informed choices. It can encourage potential new suppliers to make informed decisions about market and client characteristics and benchmark performance. Releasing this data supports accountability, transparency and better planning.
     
  • Is used regularly by a number of business areas or agencies – the data may support joined-up services for a better customer experience. It may reduce duplication and administrative costs. Consider making data available through an API.
     
  • Provides a foundation for decisions – If data establishes a basis for planning and evaluation by other agencies, individuals or organisations, there is likely to be value in the ongoing release or regular updates to the data. For example, population, household and dwelling projections. Similarly, if data can be layered or combined with many other types of data to generate insights, it is likely to have high value. For example, administrative boundaries can support socio-economic analysis, service distribution or regional planning. They can also be used for aggregation or visualisation of open data.
  • Represents a substantial investment of public resources – data collected or produced during a research or evaluation project will often have high value for other agencies and the community, like research leading to recommendations for policy, programs, legislation, fees or large projects. Data from public surveys, cost-benefit analyses or environmental impact assessments, for example, can be very valuable to share. Data that represents an entire population or domain is also valuable for analysis of patterns and trends and is a valuable cross-sector resource.

  • Supports transparency and accountability – releasing data can promote open discussion of public affairs, or contribute to a positive and informed debate on issues of public importance. The data might support effective oversight of public expenditure. The data could reveal or substantiate cases of misconduct or negligence. Releasing high-value data can enhance government accountability and transparency to the public.

Respect privacy and maintain security

  • Make sure any datasets that contain personal or sensitive information are identified and well protected.
  • When sharing data, make sure both personal information and sensitive information are protected.
  • Good advice for assessing data and applying privacy and sensitivity protections is contained in Safeguarding your data.

Practice data minimisation

Unless strict safeguards have been applied, personally identifying information cannot be released or shared.

All agencies should only capture the minimum amount of personal data necessary for each transaction. This will also minimise the need for anonymisation to remove any personally identifying information before release.

Depending on your business processes you should:

  • Ensure individuals only provide the minimum amount of personal information needed for a specific transaction or interaction
  • Enable the public to interact with your institution anonymously or pseudonymously
  • Generalise data being collected (for example, ask for an age range rather than specific age, or a postal code instead of street address)
  • Provide clear notices at data collection points when the provision of personal information is not required (for example, in community engagement or feedback processes)

Ensure contracts with third party service providers require them to minimise the personally identifying data they are collecting and managing.
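
The generalisation step listed above (an age range rather than a specific age, a postal code rather than a street address) could look something like the following. This is a minimal sketch under stated assumptions: the record layout, the field names and the helper functions `age_to_range` and `minimise` are all hypothetical, and the ten-year band width is an arbitrary choice.

```python
def age_to_range(age, width=10):
    """Convert an exact age to a band such as "30-39" (band width is an assumption)."""
    if age < 0:
        raise ValueError("age must be non-negative")
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def minimise(record):
    """Keep only generalised fields; name and street address are dropped."""
    return {
        "age_range": age_to_range(record["age"]),
        "postcode": record["postcode"],
    }


# Hypothetical submission containing more detail than the transaction needs.
submitted = {
    "name": "Example Person",
    "age": 37,
    "street_address": "1 Example Street",
    "postcode": "2000",
}
stored = minimise(submitted)
```

Where possible, the collection form itself would ask only for the age range and postal code, so that the detailed values are never captured in the first place.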

Design with users, for users

  • Engage with your customers around data collection, management and use and make sure full consent is obtained for all your data collection and use.
  • Talk to the community, research sector and industry about the type of data they want your agency to release, and how they want this data released.
  • Be open and transparent and respond to feedback about your data.
  • Make sure your data is usable by providing good metadata that uses plain English to describe what your data is about, how it was created and collected, and any relevant caveats or known limitations of the data.
  • Provide data dictionaries and other documentation that explains your data. People in other parts of your organisation or other parts of government aren’t going to know what your codes, abbreviations or specific terms are. Making your data dictionary available alongside your data helps make the meaning of your data clear. For example, your dataset could contain data in the field ‘school code’. The NSW Department of Education has its own school codes, but the federal government uses different school codes. Your data dictionary would explain which school codes you are using, which then helps people to interpret your data correctly.
  • For geospatial datasets, document which geographic coordinate system, map projection or datum the data is in. If the geospatial data was transformed from another datum, the transformation method should also be noted.
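
To make the data dictionary advice above concrete, here is one possible shape for a small dictionary published alongside a dataset, with a check that every field in the data is documented. It is a sketch only: the field names, the dictionary layout and the function `undocumented_fields` are hypothetical, not an official format.

```python
# Hypothetical data dictionary: one entry per field in the dataset.
DATA_DICTIONARY = {
    "school_code": {
        "description": "Identifier for the school",
        # States which coding scheme is used, so readers do not confuse it
        # with a different scheme that uses the same field name.
        "code_system": "NSW Department of Education school codes",
    },
    "enrolments": {
        "description": "Number of students enrolled",
        "unit": "students",
    },
}


def undocumented_fields(rows, dictionary):
    """Return the sorted names of fields that appear in the data but not in the dictionary."""
    fields = set()
    for row in rows:
        fields.update(row)
    return sorted(fields - set(dictionary))


# Hypothetical dataset rows; "region" has no dictionary entry yet.
rows = [
    {"school_code": "1234", "enrolments": 250, "region": "Example Region"},
    {"school_code": "5678", "enrolments": 410, "region": "Example Region"},
]
missing = undocumented_fields(rows, DATA_DICTIONARY)
```

A check like this can be run before each release so that new fields are not published without an explanation.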

Engage with customers about data

To build data that helps deliver customer value and that establishes trust with the community in government use of data, it’s important to engage with customers. Customer engagement about data should:

  • Be genuine
  • Build trust and collaboration
  • Be transparent and ongoing
  • Inform how government collects and uses data
  • Inspire and empower consumers to use government data
  • Look to maximise the value of data to the public
  • Be responsive
  • Grow awareness of the use, benefits and opportunities of government data use
  • Identify priorities and data demands for different community audiences
  • Target specific challenges and attempt to identify appropriate solutions

Customer feedback should also drive government data release priorities, strategic data priorities, data use, real-time release, data visualisation and innovative data partnerships.

Reuse and repurpose

  • Drive data sharing across government, to help improve service outcomes and government decision making.
  • Make sure your data is machine readable, in a format that makes it easy to process and use. Making data available through APIs automates the production of up-to-date, machine-readable data.
  • Use common standards for your data wherever possible, so that it’s easy for your agency and others to know exactly what your data refers to, and how to compare your data to other datasets.
  • Apply data quality statements to all your data, so that people know how your data was created, what it can help them to understand, and any limitations it may have. Use the Data Quality Reporting Tool to easily generate one of these statements for your data, and publish the statement generated alongside your data listing.
  • Look for opportunities to leverage data from Data.NSW in your own business environment.
  • Work with Government Information (Public Access) Act teams, and analyse common customer service enquiries, to identify data that can be routinely released. 
  • Examine information already published on your agency website to determine whether data published as PDF (in annual reports, budget papers, reports, grant funding information and so on) can also be released as machine-readable data.
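
As a rough illustration of releasing tabular information in machine-readable form, the sketch below writes the same rows as CSV and as JSON using only the Python standard library. The grant figures, field names and helper functions are invented for the example; extracting the table from a PDF is outside its scope.

```python
import csv
import io
import json


def to_csv(rows):
    """Serialise a list of dictionaries to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Serialise the same rows to indented JSON text."""
    return json.dumps(rows, indent=2)


# Hypothetical table that might otherwise appear only in a PDF report.
grants = [
    {"program": "Example Program A", "year": 2019, "amount_aud": 25000},
    {"program": "Example Program B", "year": 2019, "amount_aud": 40000},
]
csv_text = to_csv(grants)
json_text = to_json(grants)
```

Both outputs carry explicit field names, which lets other agencies load the data directly instead of re-keying figures from a document.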

Continuously improve

  • Focus on building and improving the quality of your data.
  • Be open to feedback and listen to users of your data, who may have advice on how to improve your data collection and management to make your data more usable.
  • Look for opportunities to share or release more data. This could involve including requirements in all contracts for service providers to share or release the data they collect, ensuring all contracts state that any data is owned by the government and must be returned to government custody at the conclusion of the contract, or making more data available in accessible, machine readable form.
  • Corporate reluctance to share and release data is often driven by quality concerns. Improving data governance and focussing on data quality can make it easier for agencies to start sharing their data. Documented data governance frameworks, data quality requirements and regular monitoring of data processes can build maturity and create a culture where all staff are aware of their data responsibilities and aware of how their work contributes to data initiatives and customer outcomes.
  • Once your data is available, keep it updated. If your data changes often, the best way to ensure it’s refreshed is to put automated processes in place to publish regular updates. Consider publishing your data via an API to make access to up-to-date data automatic.

Be open, accountable and collaborative

  • Be open and transparent with the community and release as much as possible as open data.
  • Communicate clearly and frequently with the community about how your agency is using data, and how data is contributing to better customer services and community outcomes.

 

Last updated: 19 June 2019