Data Management Life Cycle

The IDMF incorporates the key principles of the IMF into a schema designed to be applicable to infrastructure data across the asset lifecycle:

Figure 15: IDMF data management lifecycle

The two additional core features of the lifecycle that apply across each stage are governance and security.

The application of the data lifecycle as a standardised process across the life of an asset enhances communication between stages of the lifecycle, as different stakeholders inherit data from previous data custodians.

Create / Capture / Collect

The data management lifecycle begins with planning for the creation, collection, capture or acquisition of data. In the context of the IDMF, it is also the entry point for each stage of an asset’s lifecycle, where data has been shared or inherited from the previous stage of the asset lifecycle.

Determine data needs and requirements across the asset lifecycle

A key component of establishing an agency’s infrastructure data management approach is to determine the infrastructure asset related information needs and requirements. The information requirements of the asset should be defined following Section 5 Data Requirements and be derived from the data that an organisation needs for strategic decision-making about an asset during its lifecycle.

Agencies should understand what information is required about their infrastructure assets to support the organisation’s strategic objectives. These requirements are typically identified through a combination of decisions that are driven by both internal and external business objectives and requirements.

Not all requirements can be known or planned for at the outset, but a structured approach from the beginning will assist in further definition as future requirements become known. Commencing planning with high-level forecasts will provide a valuable starting point for future development. Adopting the IDMF will allow for flexibility and adaptability, so imperfection at this stage is acceptable.

The information requirements will also need to be defined at each phase of the asset lifecycle. For example, when collecting data at the strategy and planning phase, high-level forecasts or assumptions may be sufficient, such as 20-year regional population projections or 10-year policy impact targets. These high-level forecasts or assumptions may be provided by reference to the NSW Common Planning Assumptions rather than being separately collected or defined. However, at the operating phase, real time occupancy and service data may be required.

For example, if the asset manager requires real time information on the number of people using the asset at any point in any location, the planners can ensure the right smart technologies are installed to capture this data.

Minimise data collection

It is important to have realistic, targeted information requirements and avoid asking for everything ‘just in case’. Collecting more data than is needed to achieve the intended use of the data can create a data processing and storage burden and increase privacy and security risks. It is therefore important to consider the costs of collecting and maintaining such data against the benefits it will provide.

The data needed may also already exist. Before new data is collected or generated, it is important to check that it is not already available via NSW Government or other open data portals.

Determine the most appropriate collection method

Agencies should, wherever possible, minimise manual data entry and rekeying by transitioning from paper-based forms to online forms. This includes paper-based work orders and forms issued for responsive maintenance, inspections etc.

Commonly used infrastructure data collection technologies include:

Internet of Things (IoT) sensors – for example fibre optic, wireless, acoustic noise loggers, thermosensors and smart meters
Imagery and measurement – for example, satellite and unmanned aerial vehicles/drones
Closed Circuit Television (CCTV) and Video cameras
Technologies such as Terrestrial Laser Scanning (TLS), Mobile Laser Scanning (MLS), photogrammetry and Light Detection and Ranging (LiDAR), which are being paired with legacy technologies to optimise and expedite data collection processes.

The use of smart ICT and the Internet of Things should be considered with reference to the Smart Infrastructure Policy.

Organise / Store

To support Asset Management, organisations must organise and store large amounts of data in an appropriate and efficient manner. The choice between storage options depends on many factors and is never one-size-fits-all.

Storage environments should meet organisational information requirements (OIR), be compliant with relevant legislation and policies, be interoperable across different information management systems, and allow for the data to be stored and managed for the life of the asset.

Storage selection will be impacted by specific considerations, including:

Intended use of the data (by whom and for what purpose)
Characteristics of the existing database or information systems
Type and volume of the data to be integrated
The frequency the data will be accessed
The frequency the data will be updated
The speed at which data will need to be accessed, particularly in emergency situations
Currently available technology

It is important to maintain a common environment for data storage. While data may be housed in a number of different source management/storage systems, a single portal can provide a single point of access that draws that data together for viewing and analysis. A single source of truth helps to avoid the duplication that results from storing the same data in different locations. Apart from the extra work needed to update data in multiple locations, duplication increases the risk that data will be amended in one location but not others, increasing the risk of incorrect or out-of-date information being used.

Design for Interoperability

Technical interoperability refers to the ability of different products or systems from different service providers to exchange information with each other so they can work together seamlessly, either in the present or in the future. It requires the use of standards across infrastructure, communication protocols and technologies that may be very different from each other, so that they can communicate with and across each other.

The NSW Smart Infrastructure Policy must also be followed for infrastructure projects subject to the Infrastructure Investor Assurance Framework (IIAF) and ICT Assurance Framework from 1 May 2020. The policy contains requirements relating to interoperability. Agencies should select open technology and/or vendor agnostic platforms where available and suited to agency needs. Agencies should also use open and recognised standards within and between the horizontal common layers of smart infrastructure.

Using open standards and platforms can improve an agency’s ability to change the vendors it has paid to build and support the solutions. It can also reduce the cost of scaling to large numbers of devices and users of the solution. Additionally, it may increase the number of alternative solution options available now or for future replacements/upgrades.

Use of Internet of Things (IoT) technologies in the infrastructure must also follow the NSW Government Internet of Things (IoT) Policy.

Ensure the storage environment is appropriate and secure

The level of security applied to data storage needs to be aligned with the sensitivity and security classification of the data. Infrastructure data may contain sensitive or security classified information, including legal and financial information. Agencies should follow the NSW Information Classification, Labelling and Handling Guidelines to determine the sensitivity or security classification of the data.

While most infrastructure data will not contain personal or sensitive information, it is important to check with agency privacy, security, data and/or legal experts before selecting a storage option. This is important because under the Privacy and Personal Information Protection Act 1998, data containing personal information will need increased security protection.

There are also storage obligations under the State Records Act 1998. For example, s 21 of the Act states that a person must not ‘take or send a State record out of New South Wales’ unless permitted under the provisions of s 21(2). In general, this requires permission or approval from NSW State Archives and Records.

However, there is a general authority that provides an exemption to keeping State records within New South Wales (GA35). This authority confirms that sending records for storage with, or maintenance by, service providers outside NSW is permitted provided that an appropriate risk assessment has been completed, and records are managed in accordance with all the requirements applicable to State records.

There may also be additional state or federal legislative requirements, particularly for critical infrastructure or telecommunications infrastructure. The Critical Infrastructure Centre website provides comprehensive guidance on Australian obligations for protection of critical infrastructure.

Data accessibility

Accessibility of asset data is not only an important management practice, it is also enshrined in the NSW Open Data Policy. The Policy states that data generated by government needs to be treated as a public asset and, where appropriate, made available as widely as possible. Making infrastructure data freely accessible allows other organisations and the public to benefit from, and innovate using, the data generated. However, it is recognised that risk-based judgements on security considerations need to be balanced against open data drivers. This needs to be done in the context of agencies' specific responsibilities under the Open Access Information provisions in Part 3 of the GIPA Act (for example, in relation to contract disclosure), as well as agency obligations in relation to access applications regarding infrastructure data that may be made pursuant to the GIPA Act.

It is important to align role-based or identity-based permissions with appropriate data classifications, and to recognise that not all data relating to an asset may be of the same classification, so several different permission levels may need to be managed. For example, the locations of security systems in correctional facilities, or of drug storage in hospitals, may need tighter access than other data about the same asset.
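As an illustration only, the alignment of role-based permissions with data classifications could be sketched as follows. The classification levels, role names and field names are hypothetical and are not taken from the NSW Information Classification, Labelling and Handling Guidelines; an agency would substitute its own.

```python
from enum import IntEnum


class Classification(IntEnum):
    """Hypothetical classification levels, ordered from least to most restricted."""
    PUBLIC = 0
    INTERNAL = 1
    RESTRICTED = 2


# Hypothetical mapping of roles to the highest classification each may view.
ROLE_CLEARANCE = {
    "public_user": Classification.PUBLIC,
    "maintenance_contractor": Classification.INTERNAL,
    "security_manager": Classification.RESTRICTED,
}


def can_view(role, classification):
    """Return True if the role's clearance covers the item's classification."""
    # Unknown roles fall back to public access only.
    clearance = ROLE_CLEARANCE.get(role, Classification.PUBLIC)
    return clearance >= classification


# Data items for a single asset may carry different classifications.
hospital_data = {
    "floor_area_m2": Classification.PUBLIC,
    "hvac_service_history": Classification.INTERNAL,
    "drug_storage_location": Classification.RESTRICTED,
}


def visible_fields(role, asset_data):
    """List the fields of an asset record that a role is permitted to view."""
    return sorted(name for name, level in asset_data.items() if can_view(role, level))
```

In this sketch a maintenance contractor would see the floor area and service history of the hospital but not the drug storage location, even though all three items describe the same asset.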

Whether data is made openly accessible or access is restricted to trusted users, it is important that the data is stored with supporting metadata, data quality and fitness-for-use statements, as well as measurement error and uncertainty estimates. This is important to enable interoperability and re-use.

Analyse / Use

Data analytics is a key part of the infrastructure management process because it allows large amounts of data to be transformed into useful information. Analytics can help identify and solve problems or predict issues that have not happened yet. For example, analytics can be used to optimise infrastructure management by enabling:

more effective and efficient maintenance programs
monitoring asset condition and performance
identifying infrastructure gaps
meeting minimum reporting requirements prescribed in legislation and policy
ensuring effective capacity utilisation and planning.

Data Analysis

Data analysis should be fit-for-purpose. The type of analytics used will depend on the question being investigated. Data analytics used for infrastructure data can be reactive or proactive – these categories reside on a continuum from hindsight to insight to foresight (see Figure 16):

Reactive
Reactive analytics can assist organisations to respond quickly to operational issues or maintenance requirements.

Descriptive – this form of analytics is applied to understand what happened to the asset (e.g. past losses), to enable loss forecasting and to identify the cause of an incident.
Diagnostic – this form of analytics is applied to find out why it’s happening e.g. what process and conditions are creating the situation.

Proactive
Proactive analytics can assist organisations to better plan for the future needs of the asset users, or the asset itself.

Predictive – this form of analytics is applied to predict what will happen to the asset and often uses machine learning algorithms.
Prescriptive – this form of data analytics is applied to understand the best actions that can be taken in a particular situation to change the process operation.
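The difference between descriptive, predictive and prescriptive analytics can be illustrated with a minimal sketch. It assumes hypothetical yearly condition scores for a single asset and uses a simple straight-line trend in place of the machine learning methods often used for prediction; the renewal threshold is also an assumption. Diagnostic analytics is not shown.

```python
from statistics import mean

# Hypothetical yearly condition scores for one asset (100 = as new).
history = {2019: 90.0, 2020: 86.0, 2021: 82.0, 2022: 78.0}


def describe(scores):
    """Descriptive: summarise what has already happened."""
    values = list(scores.values())
    return {
        "mean": mean(values),
        "latest": values[-1],
        "total_decline": values[0] - values[-1],
    }


def fit_trend(scores):
    """Fit a least-squares straight line (slope, intercept) to the yearly scores."""
    xs, ys = list(scores.keys()), list(scores.values())
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar


def predict(scores, year):
    """Predictive: extrapolate the fitted trend to a future year."""
    slope, intercept = fit_trend(scores)
    return slope * year + intercept


def prescribe(scores, year, threshold=75.0):
    """Prescriptive: recommend an action based on the predicted score."""
    if predict(scores, year) < threshold:
        return "schedule renewal"
    return "continue monitoring"
```

Each function answers a different question about the same data: what happened, what is likely to happen, and what should be done about it.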

Figure 16: Data Analytics Continuum

Key data analytics concepts

Data Profiling

Understanding the profile of data includes assessing its quality and identifying potential issues with the data, including matching data formats against standards to ensure interoperability and identifying differences from expectations within datasets. Automated tools can be used to streamline these processes.
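A minimal sketch of automated profiling for a single column is shown below. It assumes, for illustration, that dates are expected in ISO 8601 form (YYYY-MM-DD); the sample values and the function name are hypothetical.

```python
import re
from collections import Counter

# Assumed standard for this sketch: dates in ISO 8601 form (YYYY-MM-DD).
ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def profile_column(values, pattern=None):
    """Summarise one column: row count, missing values, distinct values, format mismatches."""
    present = [v for v in values if v not in (None, "")]
    summary = {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        "most_common": Counter(present).most_common(1),
    }
    if pattern is not None:
        # Values that do not match the expected format are listed for follow-up.
        summary["format_mismatches"] = [v for v in present if not pattern.match(str(v))]
    return summary


# Hypothetical inspection dates drawn from an asset register.
inspection_dates = ["2023-05-01", "2023-05-01", "01/06/2023", "", None, "2023-07-15"]
report = profile_column(inspection_dates, ISO_DATE)
```

The resulting report counts missing entries and lists the values that do not match the expected format, which gives an early view of quality issues before analysis begins.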

Data Cleansing

One of the first steps in analysing data (after defining the data requirements and collecting and storing the data) is to cleanse it. Data cleansing is vital because incorrect data can generate misleading results. Automating error-checking methods can help to speed up this process; however, a data analyst still needs to be involved to investigate any issues.
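The combination of automated checks and analyst review can be sketched as follows. The field names, the set of valid condition ratings and the sample records are assumptions made for illustration. Records that pass are normalised and kept; records that fail are set aside with the reasons, so that an analyst can investigate them.

```python
def cleanse(records, valid_conditions):
    """Split records into clean rows and rows flagged for analyst review."""
    clean, flagged = [], []
    seen_ids = set()
    for rec in records:
        issues = []
        # Normalise whitespace and letter case before checking.
        asset_id = (rec.get("asset_id") or "").strip().upper()
        condition = (rec.get("condition") or "").strip().lower()
        if not asset_id:
            issues.append("missing asset_id")
        elif asset_id in seen_ids:
            issues.append("duplicate asset_id")
        if condition not in valid_conditions:
            issues.append("unrecognised condition")
        normalised = {"asset_id": asset_id, "condition": condition}
        if issues:
            flagged.append((normalised, issues))
        else:
            seen_ids.add(asset_id)
            clean.append(normalised)
    return clean, flagged


# Hypothetical raw records with inconsistent formatting and errors.
raw = [
    {"asset_id": " br-001 ", "condition": "Good"},
    {"asset_id": "BR-001", "condition": "good"},
    {"asset_id": "BR-002", "condition": "unknwon"},
    {"asset_id": "", "condition": "poor"},
]
clean, flagged = cleanse(raw, {"good", "fair", "poor"})
```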

Data engineering and modelling

An analyst often needs to combine datasets and build models with multiple data layers to build data insights. Data modelling is when a data scientist builds a data model to correlate the data, often with business outcomes in mind.

Visualisation and communication of data

Communication is the last step of the data analytics process and is often overlooked. Data needs to be delivered to the organisation in a meaningful way to support decision making. Data visualisation is about the visual representation of data as a means of communication. For example, a common way to visualise infrastructure is via spatial systems like the enhanced 3D spatial platforms and digital twins.

Data usage for business objectives

Data generated or collected can be used to continuously track and measure many factors including usage, condition and exceptions. Without clear definitions of what constitutes success at each stage of the asset lifecycle, progress can lag, data collection and use can become costly and untargeted, and the value of infrastructure assets can be diminished over time.

Once immediate business objectives are delivered, leveraging data to identify new ways of operating and new service innovations can be explored. By combining various datasets and learning from previous infrastructure projects, infrastructure data can be used to develop data assets and innovations for use across the state to enable smarter infrastructure planning. Agencies should consider how asset data can be leveraged more effectively across their organisation or be made available for innovation as an open data resource.

When working with data, it is important to remember that data can be closed, shared or open because of the sensitivity of the data, the level of risk associated with the data, and the permissions given on how it can be used and published. By understanding where data comes from, who can use it, and what can be done with it, the opportunities associated with sharing and using data can be optimised.

The Data Spectrum diagram developed by ODI illustrates the differences between closed, shared and open data.

Figure 17: The data spectrum in infrastructure

Benefits and safeguards in sharing data

Data sharing is a fundamental requirement for the management of most infrastructure assets. This is because on most infrastructure projects, there are numerous stakeholders that provide specialised services across the asset lifecycle. The need to exchange data between stakeholders in a timely and efficient way is key to the success of the asset’s management. For example, the management of an asset relies on data exchange between a design team, a construction team, manufacturers and suppliers, as well as operation and maintenance teams.

Sharing of infrastructure data and other data across NSW Government is encouraged, provided appropriate protections are in place. The Data Sharing (Government Sector) Act 2015 aims to remove barriers to data sharing within NSW Government, and to facilitate and improve government data sharing.

However, while NSW Government encourages the release of non-sensitive infrastructure data, the release of data should always take full account of security and privacy considerations. Guidance is available from the Information and Privacy Commission’s guide on Data Sharing and Privacy.

Various arrangements can be made between organisations to establish data sharing processes, including Memorandums of Understanding or Data Sharing Agreements. Further information is available at Data.NSW.

Closed or Secure Data

Closed or secure data is data that only people inside an organisation can see and use. National security, confidential business reports, and work emails are examples of data that organisations keep secure. There can be good reasons why data is closed and why closed data should not be available in the public domain.

From an infrastructure perspective, closed data is generally associated with sensitive or critical infrastructure or operations. The following references provide guidance on the ability to share (internally or externally to Government) infrastructure data:

Federal government requirements on critical infrastructure assets in the Security of Critical Infrastructure Act 2018
NSW critical infrastructure, including the ability to improve data sharing through the Trusted Information Sharing Network (TISN) for critical infrastructure resilience

Shared Data

Shared data is data that is shared with a specific organisation, or group of organisations or people, for a specific purpose. Data sharing is how NSW government agencies can provide authorised access to the data they hold in a controlled manner, to help deliver better outcomes to the people of NSW.

Guidance on sharing of data is provided by Data.NSW, including the Five Safes, also referred to as data sharing principles.

The Commonwealth Data Sharing Principles help agencies to think about all of these factors together and better manage any risks associated with data sharing.

The five Data Sharing Principles (‘The Principles’) provide a framework for government agencies to share data safely:

Share data for appropriate and authorised purposes
Share data only with authorised users
Use data in a safe and secure environment
Apply appropriate protections to the data
Ensure public outputs from data sharing projects do not identify the people or organisations in the data

If the joint protections offered by the Principles are not sufficient to protect against the risk of data breaches or data re-identification, then the data should not be shared.
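One way to record an assessment against the five Principles is sketched below. It simplifies each Principle to a yes/no judgement, whereas in practice the protections are weighed jointly; the class and field names are illustrative assumptions.

```python
from dataclasses import dataclass, fields


@dataclass
class SharingAssessment:
    """One yes/no judgement per Data Sharing Principle (field names are illustrative)."""
    appropriate_purpose: bool
    authorised_users: bool
    safe_environment: bool
    protections_applied: bool
    outputs_non_identifying: bool


def unmet_principles(assessment):
    """Return the names of any Principles that the assessment has not satisfied."""
    return [f.name for f in fields(assessment) if not getattr(assessment, f.name)]


def may_share(assessment):
    """In this simplified sketch, data is shared only when every Principle is satisfied."""
    return not unmet_principles(assessment)
```

For example, an assessment in which public outputs could still identify individuals would return that Principle as unmet, and the data would not be shared until the gap is addressed.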

Open Data

Open data is data that anyone can access, use and share. Governments and organisations have opened up access to data such as weather records, train timetables and real-time running, allowing others to use this data and discover new solutions for the benefit of all. However, simply releasing data is not sufficient. For data to be considered truly open, the owners must clearly state that other organisations or people can use it in any way they like; without express permission, the data cannot be considered open.

The NSW Government Open Data Policy is clear about its objectives and defines open data as follows: “data is open to the extent that its management, release and characteristics meet the principles of openness”.

In accordance with the Open Data Principles of the policy, agencies must manage data as a strategic asset to be:

Open by default, protected where required
Prioritised, discoverable and usable
Primary and timely
Well managed, trusted and authoritative
Free where appropriate
Subject to public input

A key component of defining information requirements for infrastructure is defining where on the data spectrum, from closed to open, the relevant infrastructure data fits.

Re-use / Maintain

The re-use of existing infrastructure data assets for additional purposes will be one of the greatest sources of value for NSW government agencies, industry and the general public. Re-use of whole of government data assets such as those presented in the NSW Digital Twin will break down existing silos and artificial barriers to the use of information across cluster, administrative and jurisdictional boundaries. Re-use of data is facilitated by standards, interoperable data systems, common data formats and proactive sharing and release of open data.

Ongoing maintenance of asset data is a core process in infrastructure management. With the collection of large amounts of infrastructure data, it is essential to have processes in place to monitor, maintain and update the data to ensure the ongoing efficiency and improvement of the infrastructure asset. Specification of the frequency of update of data to align to organisational reporting requirements will reduce the friction associated with generation of ad hoc reports and data updates.
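Once update frequencies have been specified, overdue datasets can be identified automatically. The following is a minimal sketch; the dataset names, dates and intervals are hypothetical.

```python
from datetime import date, timedelta


def overdue_datasets(datasets, today):
    """Return names of datasets whose last update is older than their agreed interval."""
    return [
        name
        for name, (last_updated, interval_days) in datasets.items()
        if today - last_updated > timedelta(days=interval_days)
    ]


# Hypothetical register: (date last updated, agreed update interval in days).
register = {
    "condition_survey": (date(2024, 1, 10), 365),
    "energy_meter_readings": (date(2024, 6, 1), 30),
}
```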

Ongoing monitoring of data assets

Continuous monitoring of the effectiveness of infrastructure data collection and use, and management of changing requirements, is needed throughout the asset lifecycle. Changes could be to the assets themselves, how they are managed, the technology supporting them, or the business requirements driving them (for example, changing stakeholder needs).

Asset owners need to reassess data governance approaches regularly in order to manage changes, privacy and security. This will ensure data is kept up to date when situations change, rather than becoming redundant or misleading. Data owners are typically accountable for ensuring that asset data is maintained and will need to establish an Asset Maintenance Schedule at the beginning of the project. This maintenance schedule should be implemented and monitored for the full duration of the asset's life and include clear processes and points of accountability for maintaining and updating the data at each phase of the asset lifecycle.

Data validation

To sustain the required level of data quality, it is important to undertake data validation based on the requirements identified in the data needs assessment. Where possible, data validation should be automated, and errors corrected at the source. Data validation should also include the detection and mitigation of malicious data. Malicious content validation is important to protect not only the integrity of the data but also the system. Malicious data that is not blocked by data validation can be used to exploit vulnerabilities within a system and create an impact on other components in a system as well as other users.
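Automated validation at the point of entry can be sketched as follows. The sensor identifier format, the occupancy range and the field names are assumptions for illustration. The character check on free-text fields is a simple example of screening for potentially malicious content and is not a complete defence on its own.

```python
import re

# Illustrative pattern only: flags markup or control characters in free-text fields.
SUSPICIOUS = re.compile(r"[<>]|[\x00-\x08\x0b\x0c\x0e-\x1f]")


def validate_reading(reading):
    """Return a list of validation errors for one sensor reading (empty if valid)."""
    errors = []
    sensor_id = reading.get("sensor_id")
    # Hypothetical identifier format: two capital letters, a hyphen, four digits.
    if not isinstance(sensor_id, str) or not re.fullmatch(r"[A-Z]{2}-\d{4}", sensor_id):
        errors.append("sensor_id does not match expected format")
    value = reading.get("occupancy")
    # Reject non-integers (including booleans) and values outside the assumed range.
    if isinstance(value, bool) or not isinstance(value, int) or not 0 <= value <= 10_000:
        errors.append("occupancy outside expected range")
    note = reading.get("note", "")
    if not isinstance(note, str) or SUSPICIOUS.search(note):
        errors.append("note contains disallowed characters")
    return errors
```

Readings that return an empty list can be accepted; any others can be rejected or returned to the source system for correction.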

Security considerations

Whether data is at rest, in use or in transit, there must be appropriate security controls in place to protect it. The security requirements of the data and systems may change over the lifecycle of the project, as the core data is updated with new data over time, and the risk profile of the data may change. It is therefore critical to assess risk changes and security requirements across the lifecycle. By assessing and re-assessing, both residual risks and newly identified risks can be mitigated with the appropriate controls. For more information, see the NSW Information Classification, Labelling and Handling Guidelines.

Archive / Destroy

When an infrastructure data asset reaches the end of its life, the asset data should be archived or destroyed. To avoid large storage costs (or to minimise the risk of premature data destruction), agencies should assess and identify considerations that apply to the retention and destruction of data such as:

Is there personal information in data assets? If so, under the Privacy and Personal Information Protection Act 1998, personal information should be destroyed as soon as the objective it was collected for is completed, in accordance with relevant requirements in the State Records Act. Note: Personal information may be required to be kept in some contexts, for instance, throughout an ongoing legal proceeding, and if so should be kept in accordance with the Privacy and Personal Information Protection Act 1998.
Is there health information in data assets? If so, health information should be managed under the requirements of the Health Records and Information Privacy Act 2002.
Data and records should be disposed of in accordance with NSW Information Classification, Labelling and Handling Guidelines. The Guidelines recommend records are disposed of with the same level of security at which they are maintained. Guidance on de-identifying information is available from the Information and Privacy Commission.
Are internal or external services dependent on the data?
Are very large volumes of data involved? If so, it may not be economical to maintain the data for long periods of time. Approaches will be needed to routinely purge the data that is not needed for ongoing use.
Are there any audit or accountability requirements applying to the infrastructure asset management process?

The State Records Act 1998 sets the rules for how long all government information needs to be retained. Depending on the type of infrastructure, the data asset will have different legal retention and destruction requirements. Agencies should refer to the NSW State Archives and Records website for more information. Any decision to archive or destroy data must also be made in accordance with the organisation’s records and information management requirements.

All retention and destruction decisions need to be authorised and documented to achieve transparency and accountability over the destruction of infrastructure data assets. Governance and approvals must be defined in the agency’s data governance documentation to ensure compliance with the State Records Act 1998, as well as with consideration of Government Information (Public Access) Act 2009 (GIPA) requirements. If working with multiple service providers, agencies should make sure they can all support and deploy the data retention and destruction frameworks required.
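A minimal sketch of a documented destruction decision is shown below. The record structure and names are hypothetical, and the retention date would be derived from the retention requirements that apply to the data asset rather than set in code.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DisposalDecision:
    """Hypothetical record of a proposed destruction of a data asset."""
    dataset: str
    retain_until: date                    # derived from the applicable retention requirement
    authorised_by: Optional[str] = None   # approver recorded for accountability


def can_destroy(decision, today):
    """Destruction proceeds only after the retention date and with a recorded approver."""
    if today < decision.retain_until:
        return False, "retention period has not ended"
    if not decision.authorised_by:
        return False, "no authorisation recorded"
    return True, "eligible for destruction"
```

Keeping the outcome and the reason together supports the transparency and accountability described above, since each decision can be logged alongside its justification.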

Last updated 29 Jul 2024