Stats NZ wants to overhaul the census and largely replace individual surveys with admin data. Indira Stewart explains what admin data is and how it profiles every New Zealander.
The days of the traditional census survey forms could soon be over.
Stats NZ has proposed a massive transformation plan for the future of our census, the five-yearly snapshot of the people who live in New Zealand.
The statistics department is proposing an "admin data first" approach.
New Zealanders can give feedback online or over the phone on the proposed changes. But many still have no idea exactly what admin data is, where it comes from and how accurate it might be.
Census data is crucial for helping inform government policy and funding towards the country's needs. So, it's important the admin data Stats NZ plans to use is as accurate and as reliable as possible.
What is admin data?

Admin data is essentially information the government already knows about you. It's data that has been collected from government or private sector organisations that you've interacted with. It can also come from past censuses and Stats NZ surveys.
For example, every time you call Inland Revenue and update your contact details, it becomes part of that admin data.
If you get diagnosed with an illness, get a vaccination, or give birth, as long as it's on Health Ministry records, it becomes part of admin data.
If you pay income tax or claim a benefit, that becomes part of admin data. Anyone who has a RealMe account provides details that also become part of admin data.
Admin data is held in what's known as the Integrated Data Infrastructure (IDI), a tool developed by Stats NZ to combine data about each New Zealander from many databases across agencies.
The IDI continues to grow and currently has a list of about 11 million people who have ever lived in Aotearoa.
Stats NZ argues admin data is "very powerful" when the data is linked across organisations to profile a single person.
What about rights to privacy?

This is a concern commonly raised with Stats NZ.
It said that when it accesses admin data in the IDI to create a profile of certain individuals, it must balance New Zealanders' rights to privacy against the benefits of sharing important information.
To do this, Stats NZ needs social licence, which it defines as "the permission we have to make decisions about the management and use of the public's data".
The IDI tool combines raw data about a person from different agencies. It then "cleans" the data by linking and matching the information together to determine the most up-to-date and accurate information — for example, a person's address and income.
That data is then de-identified, meaning any personally identifying information such as names, dates of birth and addresses is stripped away. Numbers that can identify people, like IRD and NHI numbers, are encrypted (replaced with another number).
What's left is known as the 'IDI Clean' data set which, according to Stats NZ, is the finalised data set that can be accessed by researchers.
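For readers who want to see the idea in concrete terms, the link, clean and de-identify steps described above can be sketched in a few lines of Python. This is an illustration only, not Stats NZ's actual method: the record fields, the made-up people, the `SECRET_KEY` value and the function names are all assumptions, records are matched on an exact name and date of birth for simplicity, and a keyed hash stands in for the encryption of IRD numbers.

```python
import hashlib
import hmac
from collections import defaultdict

# Hypothetical secret key; it would be kept away from researchers.
SECRET_KEY = b"illustrative-key-only"

# Hypothetical raw records from two agencies (every value is made up).
RAW_RECORDS = [
    {"source": "tax", "name": "Ana Example", "dob": "1990-04-01",
     "ird": "123456789", "address": "1 Old Street", "region": "Wellington",
     "income": 52000, "updated": "2022-03-31"},
    {"source": "health", "name": "Ana Example", "dob": "1990-04-01",
     "ird": "123456789", "address": "9 New Road", "region": "Auckland",
     "income": None, "updated": "2023-08-15"},
    {"source": "tax", "name": "Ben Sample", "dob": "1985-11-20",
     "ird": "987654321", "address": "5 High Street", "region": "Otago",
     "income": 61000, "updated": "2023-03-31"},
]

# Fields removed before researchers see the data.
DROPPED_FIELDS = {"name", "dob", "address", "ird", "source", "updated"}


def pseudonymise(identifier: str) -> str:
    """Replace an identifying number with a keyed hash (a stand-in for encryption)."""
    digest = hmac.new(SECRET_KEY, identifier.encode(), hashlib.sha256)
    return digest.hexdigest()[:12]


def link_records(records):
    """Group records that share an exact name and date of birth."""
    groups = defaultdict(list)
    for record in records:
        groups[(record["name"], record["dob"])].append(record)
    return list(groups.values())


def merge_latest(group):
    """Merge linked records so newer non-empty values override older ones."""
    merged = {}
    for record in sorted(group, key=lambda r: r["updated"]):
        for field, value in record.items():
            if value is not None:
                merged[field] = value
    return merged


def de_identify(person):
    """Drop identifying fields and swap the IRD number for a pseudonym."""
    clean = {k: v for k, v in person.items() if k not in DROPPED_FIELDS}
    clean["person_key"] = pseudonymise(person["ird"])
    return clean


def build_clean_dataset(records):
    """Link, merge and de-identify the raw records."""
    return [de_identify(merge_latest(group)) for group in link_records(records)]


if __name__ == "__main__":
    for row in build_clean_dataset(RAW_RECORDS):
        print(row)
```

In this sketch, the two records for the first person collapse into one row that keeps the newer region and the only recorded income, while the name, date of birth, address and IRD number never reach the output.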
Concerns have been raised around the risks the IDI tool poses to New Zealanders' privacy because a large, centralised store of personal data could be vulnerable to hacking. There are also criticisms that the Data and Statistics Act 2022 doesn't prevent authorised uses of the data from harming people.
While New Zealanders have a right to request what information is held about them inside the IDI under the Privacy Act, Stats NZ usually refuses requests by people who want a copy of their data because it says the IDI tool isn't designed to allow the removal of information relating to individuals.
This means there's no way for people to know whether all the information stored in their IDI profile is correct.
Errors on an individual's profile could be a result of incorrectly linking two or more different people in the IDI. If a person doesn't know there's an error on their profile, they can't have their information corrected.
Who can access admin data?

Researchers who want to access the "clean" admin data are vetted by Stats NZ and must commit to using the data safely. Researchers must be able to show that their request for data is in the public interest.
Once approved, they enter a "data lab" room in the Wellington-based Stats NZ building where they are locked down while they access the requested data. They are not allowed access to mobile phones or the internet while inside the data lab and are also prohibited from taking photos.
Once they've selected the data they need, it's checked by Stats NZ to ensure it doesn't contain any identifying information.
Stats NZ, or any other third party accessing data, is prohibited from disclosing any personal information about a person. However, the Privacy Act states that "if any personal information is made available in good faith", no legal action can be brought against Stats NZ or the third party who shared the information.

How accurate is the admin data?
Admin data in the IDI isn't always perfectly linked, and this can affect how it's used for statistical or census purposes, Stats NZ said.
This could mean a person is assigned the wrong value for a variable (such as age or income), which would result in a measurement error. Other errors can occur when two different people are wrongly linked, for example, if their names and dates of birth are very similar.
Stats NZ said its "false link" rate is under 2% for IDI linkages and believes that any impact from false links is likely small.
The accuracy and quality of the linked data also depend on how different government departments, agencies and past censuses have recorded a person's data.
In a 2021 report, Stats NZ said variables such as age, sex, birthplace and employment status had close to full coverage and high accuracy, an indication that the admin data was likely to provide high quality census information.
However, data capturing religious affiliations, what languages people spoke, their occupation and hours worked had very low coverage and low accuracy.
"Some variables such as language, religion, or unpaid activities are personal information that is unlikely to ever be captured well by government agencies," the Stats NZ report said.
With improved technology and the use of sources such as cellphone data, social media and commercial datasets, the report said "good statistical information from alternative data sources may be possible in future".
What about the data that can't be captured this way?

What is missing from admin data remains a concern, particularly for minority groups, those living with disabilities and marginalised communities.
Some events aren't captured in the New Zealand admin data because they aren't recorded here. These include births, marriages, civil unions and divorces that take place overseas, and qualifications obtained overseas.
These data gaps affect new migrants.
Admin data often excludes important information about minority groups who have low engagement rates with government agencies, too. This can include New Zealanders who are homeless, living with a disability or who are transient — often moving from house to house. Undocumented migrants, refugees and overstayers are also unlikely to be captured by admin data.
Last year was the first time the census included questions about gender, sexual identity and variation of sex characteristics.
"It was important that the census represented all people of Aotearoa New Zealand and the collection of that information would enable groups and individuals to use the census data to advocate for the needs of Rainbow communities, the same as every New Zealander," Stats NZ said on its website.
It is not known how accurately current admin data captures information relating to New Zealand's Rainbow communities.
Stats NZ said it will prioritise using admin data sources for future censuses. Where admin data is not available or not of high quality, it will collect data by working with priority communities and using "targeted surveys and bespoke solutions".
So, why the proposed change?
The 2018 census was plagued with difficulties and marked the lowest response rate in 50 years at 83.3% — down from 94.5% in 2013. Māori and Pacific rates were considered "unacceptably low" even after being boosted by supplementary admin data. Its final cost was $128 million.
The 2023 Census, which is due to release its first data set tomorrow, had an 88.3% response rate, still below the target of 90%.
But admin data was combined with census form responses and is expected to lift the census coverage rate to 97%. Last year's census cost a record $320 million, more than double the cost of the previous census.
In a February briefing to Minister of Statistics Andrew Bayly, Stats NZ said it was becoming harder to motivate people to complete census and survey forms. It also said some people are now more hesitant to participate in public initiatives or provide personal information to the government.
"It now takes more time, effort, and resources to achieve satisfactory response rates," the Stats NZ briefing said.
The current large-scale survey using a traditional method that goes back to the first census in 1851 is no longer sustainable and an admin data first approach would save costs, it said.
New Zealanders can submit feedback on the Stats NZ proposal until June 18. Following public consultation, an independent evaluation will be made, and a proposal put forward to Cabinet later this year.
A second round of public consultation, seeking more detailed feedback, will take place in 2025 before the next census model is confirmed.
This means by 2028, admin data could be the main tool capturing the next historical snapshot of New Zealand and the stories behind our data.