“Individual data collected by statistical agencies for statistical compilation, whether they refer to natural or legal persons, are to be strictly confidential and used exclusively for statistical purposes.”
A fear that ‘Big Brother is watching us’ pervades 21st century society. We want information about our world, but not at the expense of our privacy and security. We are concerned when we think an organization or government knows a lot about us: our movements, finances, families. So when data are collected for statistics, either directly from respondents or automatically, people need assurances that the data will remain confidential and will not be used inappropriately: that NSOs are not Big Brother.
Such assurances come in the form of confidentiality policies and laws. They cover every aspect from the collection of data to how they are stored, processed and published. They protect the data themselves from accidental or malicious access, release or identification—known technically as confidentiality—and they protect the individuals who provide the data—known as privacy.
‘Microdata’, complete records of all the answers that people give when they complete a survey, are immensely useful to researchers. An aggregate statistic, such as average household income, can be a useful headline figure. But only microdata permit in-depth analysis of how different factors such as location, age, ethnicity or education influence the figures, to help disentangle causes. NSOs have detailed policies about who can access microdata and what they may do with them. Often researchers can access them only in secure computer rooms on the NSO’s premises. Names, dates of birth and other identifying features are replaced with unique numbers to anonymize the data.
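The replacement of identifying features with unique numbers can be sketched as follows. This is a minimal illustration, not any NSO's actual procedure: the field names, the `pseudonymize` helper and the nine-digit ID range are all assumptions made for the example.

```python
import secrets

# Hypothetical names of the fields that directly identify a respondent.
IDENTIFYING_FIELDS = ("name", "date_of_birth")


def pseudonymize(records, identifying_fields=IDENTIFYING_FIELDS):
    """Return copies of the records with direct identifiers dropped
    and a random, unique record number added in their place."""
    used_ids = set()
    result = []
    for record in records:
        # Draw random numbers until we find one not already assigned,
        # so the ID carries no information about the respondent.
        while True:
            record_id = secrets.randbelow(10**9)
            if record_id not in used_ids:
                used_ids.add(record_id)
                break
        cleaned = {k: v for k, v in record.items() if k not in identifying_fields}
        cleaned["record_id"] = record_id
        result.append(cleaned)
    return result


survey = [
    {"name": "A. Example", "date_of_birth": "1950-03-01", "income": 21000},
    {"name": "B. Example", "date_of_birth": "1984-11-17", "income": 34000},
]
microdata = pseudonymize(survey)
```

Removing direct identifiers in this way does not by itself prevent identification from the characteristics that remain in each record.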
Even when anonymized, an individual could be identifiable in data if they belong to a very small group. The more disaggregated the figures—that is, the more different characteristics used to define them—the greater the risk to confidentiality. For example, in the population of a town there might only be one 73-year-old ethnic minority woman with a master’s degree. NSOs have rules to determine how big a group must be before they can publish information about its members. Published figures might group together the women aged 70-74 to maintain their confidentiality.
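A rule of this kind can be sketched in a few lines. The sketch below assumes a hypothetical minimum group size of 3 and five-year age bands; real NSOs set their own thresholds and groupings, and the helper names are invented for the illustration.

```python
from collections import Counter

MIN_GROUP_SIZE = 3  # hypothetical threshold, not an official value


def age_band(age, width=5):
    """Map an exact age to a band label such as '70-74'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def publishable_counts(records, key, min_size=MIN_GROUP_SIZE):
    """Count records per group; groups below min_size are suppressed (None)."""
    counts = Counter(key(r) for r in records)
    return {group: (n if n >= min_size else None) for group, n in counts.items()}


residents = [
    {"sex": "F", "age": 73},
    {"sex": "F", "age": 70},
    {"sex": "F", "age": 71},
    {"sex": "F", "age": 74},
    {"sex": "M", "age": 73},
]

# Defined by exact age, every group has one member, so all are suppressed.
by_exact_age = publishable_counts(residents, key=lambda r: (r["sex"], r["age"]))

# Grouped into five-year bands, the women aged 70-74 form a group of four,
# which is large enough to publish under the assumed threshold.
by_age_band = publishable_counts(
    residents, key=lambda r: (r["sex"], age_band(r["age"]))
)
```

Here the single man aged 70-74 is still suppressed after banding, which shows why coarser grouping alone may not be enough for every cell.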
Principle 6 also safeguards proper use of confidential data. While NSOs have access to tax files, border crossing records and census returns, these data may not be used to identify tax evaders or track down undocumented migrants. NSOs follow strict rules, typically laid down in law, to ensure that data can only be used for statistical purposes.
Video message from Juan Manuel Rodriguez Poo, President of the National Statistics Institute of Spain, posted on YouTube
In celebration of the 30th anniversary of the formulation of the Fundamental Principles of Official Statistics by the United Nations, I would like to congratulate the entire statistical community that shares this code of practice. A code that is fully in force and that has made it possible to improve statistical institutions and their products.
Official statistics are a public good in the service of society. They are prepared using reliable, complete and timely information that is obtained thanks to the collaboration of people, companies and institutions.
Principle 6 of this code states that the individual data used to compile official statistics are protected. Statistical confidentiality guarantees that the information will not be published in a way that allows the identification of any of the units studied. And this is the basis of the trust that allows high quality official statistics to be made available. In addition, these data can only be used for statistical purposes, in accordance with this fundamental principle and the provisions of our legal frameworks.
Regardless of whether the data are collected through responses to a survey questionnaire or a census, from an administrative record or from a big data source, each individual record, each person or company, is protected, so that no individual can be identified and no particular action can be taken on the basis of the statistical files. In addition, no data transfers are made that allow any individual processing. This is a commitment of the entire statistical community.
In an increasingly interconnected and technologically enabled world, it is necessary to develop robust mechanisms to facilitate the statistical use of data and, at the same time, to fulfil our commitments to all our respondents. That is why several international working groups are investigating new methods to guarantee the secure use of information. Anonymization techniques, encryption, secure processing, etc., are nowadays part of the continuous training and statistical culture in all organizations.
Statistical science and the institutions of official statistics constitute a reference in the application of methods for the protection of confidential information and serve as an example for other disciplines that have to deal with these issues, such as health research, justice, etc.
INE Spain is very grateful to UNECE for including us in this celebration. This thirtieth anniversary should be an opportunity for us to recognize our achievements, to reflect and to continue improving our relationship with respondents, guaranteeing the protection of their data and thus fostering their trust in statistical institutions.