Census Recommendations
The population and housing census is an important pillar of a national statistical system, providing data on the population and its social, demographic and economic characteristics. In June 2025, the United Nations Economic and Social Council adopted a resolution urging Member States to conduct at least one population and housing census under the 2030 World Population and Housing Census Programme, from 2025 to 2034. The Conference of European Statisticians Recommendations for the 2030 Round of Population and Housing Censuses provide guidance and assistance to countries in the planning and execution of their population and housing censuses. They reflect the reality and needs of countries of Europe, North America, Central Asia and other countries participating in the Conference of European Statisticians. The Recommendations facilitate and improve the comparability of census data through the identification of a core set of census topics and the harmonization of concepts, definitions and classifications. The Conference of European Statisticians endorsed the Recommendations in June 2025.
UNECE
November 2025
Chapter 5 Technology
5.1 Introduction
208. Technology has been used to assist in all phases of population and housing censuses. This chapter addresses general considerations and provides recommendations for the use and testing of technologies for the census. The focus is on the technological solutions that have become widespread in the 2020 round of censuses, such as electronic data collection tools and cloud technologies. Technologies for using the data from administrative sources for census purposes are also discussed. It is understood that new avenues for the use of technology, such as machine learning and artificial intelligence, are likely to unfold in the years leading to the 2030 census round.
5.2 Evaluation in advance
209. Technology has the potential to improve the coverage and data quality of the census, reduce costs and enable timelier dissemination of results. However, in the short term, the introduction of new technologies may increase costs. NSOs need to consider how the new opportunities provided by technological innovation may contribute to improving the data and operations as well as the environmental footprint of the census. Adoption of new technologies or methodological approaches should only be considered where there is a sound understanding of their benefits and where their development can be managed.¹⁵

15 For a detailed discussion on the decision-making process for using a particular technology for collecting census data, see section B.4 in United Nations Department of Economic and Social Affairs (2019), Guidelines on the use of electronic data collection technologies in population and housing censuses. New York: United Nations.

210. The feasibility of adopting any technology that is previously untested in a census environment should be carefully evaluated in advance, taking into consideration the national context, the relative costs compared with traditional solutions, the work needed for development and testing, the potential implications for the overall organization of the census operations, the potential effects on the quality of census results, and the impact on the general population and on the environment.
211. NSOs should undertake an evaluation well in advance of the census to determine what systems and processes are appropriate for their own situation. Issues to be considered include:
(a) The relative costs of staff and clerical-based processes compared with costs of possible computer systems and associated infrastructure;
(b) The technological capability and infrastructure within the NSO and the country as a whole;
(c) The capacity of the NSO to manage complex and sophisticated systems development processes;
(d) The availability of funding and time for developing and testing the technological solution.
212. Before the adoption of new technologies or methodological approaches, the NSO should have a clear understanding of the associated risks. How will the new technology perform? What will be the reactions of respondents or census staff? Because of the long intervals between census cycles (5 to 10 years), opportunities to learn first-hand from new approaches may be limited.
213. In considering the introduction of innovations, the NSO should strongly consider learning from the experience of other census agencies internationally. Consideration may also be given to collaboration with other organizations in jointly testing new approaches or technologies before their introduction.
214. The complexity of much of the new software and the infrastructure required for many of the new and emerging technologies may go beyond the current technical capabilities of an NSO to manage. It is likely, therefore, that some countries will want to consider whether significant components of technical solutions to the census operation could be outsourced (see section 4.4).
5.3 Testing
215. The NSO should adopt a strong testing strategy and consider testing activities as a priority for the census. From the technology perspective, testing should pertain to functional correctness, non-functional requirements, interfaces, error handling, performance, reliability, usability and security. Besides the testing cycle, the deployed solutions must be certified for cybersecurity.
216. It is recommended to include the following components in the testing cycle for census technology:
(a) Unit tests of individual software components to identify bugs and facilitate code changes; they are usually automated and can be run frequently;
(b) Field operations tests for new or modified procedures and technologies in data collection;
(c) Systems and operational readiness tests in the production environment;
(d) Tests for backup and recovery of data to verify that data can be reliably restored to its original state after being lost, corrupted or compromised;
(e) Tests against ransomware attacks, to evaluate the effectiveness of defences and response and recovery capabilities;
(f) Integrated volume and performance tests, also known as load or stress tests, simulating peak load on user interface and peak size of data;
(g) Load stability measure test for the correctness of the system configuration, hardware and software for data processing;
(h) Security testing – Conducting penetration tests to identify vulnerabilities and validate security measures;¹⁶

16 See for instance the Web Security Testing Guide developed by the Open Web Application Security Project® (OWASP®) Foundation.

(i) Non-functional testing for the performance, reliability, utility and other non-functional aspects of a software application;¹⁷

17 See for instance Pezzè M., Young M. 2007. Software testing and analysis: process, principles, and techniques. Wiley.

(j) Functional tests – Ensuring that all software functions operate according to specifications and meet user requirements;¹⁸

18 Idem.

(k) Tests for new online services – Evaluating accessibility, functionality and user experience of newly implemented digital census platforms;¹⁹

19 Idem.

(l) System-integrated testing or interface testing of integration points between two or more systems;
(m) Integrated end-to-end test of all integrated systems;
(n) System acceptance test to confirm both functional and non-functional system requirements, using a comprehensive set of scenarios.
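As an illustration of item (d) in the list above, a backup and recovery test can be automated by comparing checksums before the simulated loss and after the restore. The following is a minimal sketch only: the file name, the sample contents and the function names are hypothetical, and a real test would run against the actual backup tooling rather than a file copy.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def run_backup_restore_check() -> bool:
    """Back up a sample file, simulate its loss, restore it and compare."""
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp)
        original = work / "responses.csv"  # hypothetical data file
        original.write_text("dwelling_id,persons\nD001,3\nD002,1\n",
                            encoding="utf-8")
        expected = sha256_of(original)

        backup = work / "backup" / "responses.csv"
        backup.parent.mkdir()
        shutil.copy2(original, backup)   # take the backup

        original.unlink()                # simulate loss of the data
        shutil.copy2(backup, original)   # restore from the backup

        # The restore is acceptable only if the content is byte-identical.
        return sha256_of(original) == expected
```

A check of this kind is cheap enough to run frequently alongside the unit tests in item (a).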
217. Testing of the efficacy and coherence of the entire census system, including any outsourced technology, is especially important where respondents are provided with multi-mode options for completing the census returns, such as a mobile phone, tablet or desktop computer. Each of these modes needs to be tested for accessibility, user experience, availability, performance and security.
218. In the context of the rapid development of IT and the resulting significant changes in many technological elements of the census after the 2020 round, mandatory testing of the entire technological solution in a pilot census becomes especially important in preparing for the 2030 round census.
5.4 Management
219. Census operations involve a range of administrative processes that are common to other large-scale projects. For example, the planning of a complex operation such as the census would benefit from the use of project planning software and systems and applications for recruiting and paying large numbers of temporarily employed census enumerators. The NSO should consider how technology might assist in improving the efficiency and effectiveness of these operations. This can contribute both to containing the cost of the census and improving the overall quality of the census by allowing resources to be focused on the primary tasks of enumeration, processing and dissemination.
220. Technology solutions are now available which can combine multiple field management functions. Customer Relationship Management (CRM) can provide solutions for field application management, front-end website, enhanced communication through chat-bot functionality and a citizen helpdesk for knowledge management.
221. Modern technologies provide opportunities to improve the management of field operations and thus the quality of the census itself. Multi-modal collection operations require that timely information be provided to census enumerators so that they do not visit households that have already submitted a census form. This is both an efficiency issue and a public relations issue.
222. While the key issue is the flow of timely information to the census enumerator, the same systems should also provide for a close to real time two-way flow of information between census managers and enumeration staff. Such monitoring of enumerator work will allow for more timely interventions where the data collection process is falling behind schedule or there are some problems with the quality of the data collected.
223. The NSO may need to rely on external organizations for key parts of the solution. Regardless of whether these systems are internal or external, they must adhere to internationally (e.g. ISO/IEC 27000 family), regionally (e.g. NIS2²⁰ in the European Union) or nationally agreed cybersecurity standards. This is especially important as software for various external or internal operations (collection, processing, geographical information systems, imputation, dissemination) is provided by a variety of solutions (e.g. one-stop-shop, proprietary, custom developments).

20 Directive (EU) 2022/2555 of the European Parliament and of the Council of 14 December 2022 on measures for a high common level of cybersecurity across the Union.

224. A key factor in reducing risk is the relationship with the technology partners (contractors). Strong governance, where the in-house census management team remains at the core, is critical to ensuring census systems are designed, implemented and delivered successfully.
225. A census management infrastructure should contain the following elements:
(a) a register of dwellings with addresses and geospatial coordinates, in which all addresses are distributed by enumeration areas;
(b) a register of enumerators and their contact details with the possibility of linking each of them to a certain enumeration area (from the register of dwellings) and the addresses that it covers – if this data collection method is used;
(c) a register of devices for data collection (for example, tablet computers or smartphones) and their unique serial numbers with the possibility of linking them to the enumerators (from the register of enumerators) – if this data collection method is used;
(d) a central census data storage facilitating the collection, processing and accumulation of all data linking to the respondent's residential address (from the register of dwellings) that are received by all data collection methods (online responses, enumerators data, data from administrative sources);
(e) communication software to enable timely exchange between enumerators, supervisors and the census management team.
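The links between the registers in items (a) to (c) can be pictured as a small data model. The sketch below is illustrative only: all class, field and method names are assumptions rather than part of the Recommendations, and a production system would hold these registers in a database.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Dwelling:
    dwelling_id: str
    address: str
    latitude: float
    longitude: float
    enumeration_area: str


@dataclass
class Enumerator:
    enumerator_id: str
    phone: str
    enumeration_area: str


@dataclass
class Device:
    serial_number: str
    enumerator_id: Optional[str] = None  # None until the device is issued


@dataclass
class CensusRegisters:
    dwellings: Dict[str, Dwelling] = field(default_factory=dict)
    enumerators: Dict[str, Enumerator] = field(default_factory=dict)
    devices: Dict[str, Device] = field(default_factory=dict)

    def assign_device(self, serial_number: str, enumerator_id: str) -> None:
        """Link a registered device to a registered enumerator."""
        if enumerator_id not in self.enumerators:
            raise KeyError(f"unknown enumerator: {enumerator_id}")
        self.devices[serial_number].enumerator_id = enumerator_id

    def addresses_for_device(self, serial_number: str) -> List[str]:
        """Addresses to load on a device, found via its enumerator's area."""
        enumerator_id = self.devices[serial_number].enumerator_id
        if enumerator_id is None:
            return []
        area = self.enumerators[enumerator_id].enumeration_area
        return sorted(
            d.address for d in self.dwellings.values()
            if d.enumeration_area == area
        )
```

Keeping the device-to-enumerator link in one place means that replacing an enumerator or a device requires updating a single field.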
226. All elements of this infrastructure should be interconnected and managed centrally using software and technology tools specially developed for the census purposes. It is recommended to perform the installation, configuration and performance check of the devices in the census management centre before delivering the devices to the field personnel.
227. An integrated field communication system can use, and build on, already existing IT infrastructure. This implies, for example, direct access to the national geocoding infrastructure that provides data of single addresses. The geospatial information (addresses, buildings, cadastral parcels) should rely on using permanent identifiers (see Chapter 8).
228. The central census data storage management tools provide the following functions:
(a) Downloading of forms completed online (if this data collection method is used);
(b) Verification of the forms’ suitability for processing (availability of answers necessary for the respondent's identification);
(c) Linking completed forms to the addresses of dwellings;
(d) Creation of a confirmation, such as a QR code, of the successful participation in the census for the respondent's feedback (depending on the method of receiving the completed form, to the personal account of the online census respondent, by email or other means) and for the transmission to the enumerator – if this data collection method is used;
(e) Downloading completed electronic questionnaires from the enumerators’ devices, from administrative sources and other data collection channels used in the census, linking them to the addresses of dwellings;
(f) Consolidation of versions of completed census forms obtained from all data collection channels used in the census relating to the same dwelling address – selection of the reference version, complementing it with data from other versions, removing duplicates;
(g) Calculation and visualization of statistics for monitoring the progress of the enumeration, to identify issues requiring an intervention or adjustment.
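Functions (f) and (g) above can be sketched in a few lines. This is an illustration under stated assumptions, not a prescribed method: the channel priority used to select the reference version, the record layout and the function names are all hypothetical, and each NSO would define its own consolidation rules.

```python
# Hypothetical channel priority: the lowest number is the reference version.
CHANNEL_PRIORITY = {"online": 0, "enumerator": 1, "administrative": 2}


def consolidate(forms):
    """Merge all forms received for the same dwelling into one record."""
    by_dwelling = {}
    for form in forms:
        by_dwelling.setdefault(form["dwelling_id"], []).append(form)

    consolidated = {}
    for dwelling_id, versions in by_dwelling.items():
        versions.sort(key=lambda f: CHANNEL_PRIORITY[f["channel"]])
        merged = dict(versions[0]["answers"])       # reference version
        for other in versions[1:]:
            for question, value in other["answers"].items():
                if merged.get(question) is None:    # only fill gaps
                    merged[question] = value
        consolidated[dwelling_id] = merged
    return consolidated


def completion_rate(consolidated, all_dwelling_ids):
    """Share of registered dwellings with at least one form received."""
    if not all_dwelling_ids:
        return 0.0
    done = sum(1 for d in all_dwelling_ids if d in consolidated)
    return done / len(all_dwelling_ids)
```

Here the reference version is never overwritten; other versions only supply answers that the reference version lacks.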
229. Training and technical support for enumeration staff is an important issue. It should not be assumed that the people who are likely to be recruited for enumerator tasks are technically competent. Technology would facilitate the provision of training and technical support to enumerators, in particular if this needs to be done remotely.
230. NSOs could consider remote-access technology to support flexible working arrangements of staff who process the census data while ensuring data security and confidentiality.
5.5 Direct enumeration
5.5.1 Internet
231. For direct enumeration, it is recommended to offer the option of responding over the Internet as the first or preferred option.
232. Responses to electronic questionnaires reduce overall collection cost and yield data of better quality. As with other electronic data collection modes, data processing time and cost are reduced compared to paper-based modes because data will be formatted and can be uploaded or captured directly into databases. Data quality may also be higher because the instrument can contain built-in edits and prompts.
233. Using the Internet as the medium means that the data is collected through self-enumeration rather than an interview. Self-administration of responses over the Internet does not require a visit of the enumerator and thus eliminates any influence that enumerators may have on the responses. By providing more privacy, it also makes it easier for respondents to correct and edit their own answers.
234. The Internet option can be incorporated into any of the traditional methods of delivering and collecting census forms such as drop-off/pick-up, mail-out/mail back.
235. For all the above-mentioned reasons, the Internet response is clearly the preferable option for data collection in direct enumeration.
236. The key factor to be considered is managing the collection control operations – that is, ensuring that every household and individual is counted once and once only. This requires the ability to link each household and any individual within the household to its geographic location. Furthermore, if the design of enumeration additionally includes the collection of forms by enumerators, they must receive suitable and timely feedback to update their own collection control information so that they do not visit households that have already responded.
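The feedback to enumerators described above amounts to removing already-responded dwellings from each workload and flagging dwellings with more than one return. A minimal sketch follows, in which dwelling identifiers stand in for the link to geographic location and both function names are hypothetical.

```python
from collections import Counter


def follow_up_addresses(assigned, responded):
    """Dwellings in an enumerator's workload that still need a visit."""
    responded_set = set(responded)
    return [d for d in assigned if d not in responded_set]


def multiple_returns(received):
    """Dwelling ids with more than one return, for review before counting."""
    counts = Counter(received)
    return sorted(d for d, n in counts.items() if n > 1)
```

For example, with `assigned = ["D1", "D2", "D3"]` and an online response received for `"D2"`, the enumerator's updated list contains only `"D1"` and `"D3"`.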
237. The potential level of take-up of an Internet option should be considered by assessing the proportion of the population who can access the Internet from home, the proportion who use broadband services, or the general use of the Internet for other purposes such as banking, filing tax returns or shopping.
238. Systems and processes that allow for Internet return of census forms will also need to be developed. These are proven to save costs by reducing enumerator workloads, data capture, printing and postage.
239. Data security is a very important issue and should be a key consideration in designing the infrastructure. Physically separate infrastructures should be set up to collect and to process the census information. Completed individual census forms, after their collection and capture, should be moved into a secure data processing infrastructure that is separate from the collection infrastructure.
240. A standard census questionnaire that is downloadable from the Internet requires much less infrastructure than a form that is completed online. However, downloadable forms generally require a greater level of computer literacy than online forms. They will not necessarily work on different computer configurations and there will be an expectation that the NSO will be able to deal with each individual problem. Recent experience has shown that respondents generally prefer completing the form online. For these reasons, forms for online completion are recommended.
241. Adopting the Internet response option requires the provision of credentials (usernames, passwords) to the respondents for accessing the online form. Methods of delivering the credentials include:
(a) Mailing paper forms or letters;
(b) Delivery by the enumerator directly to the respondent’s address;
(c) Sending by email;
(d) Sending by Short Message Service (SMS);
(e) Using the credentials of online public service portals or other online services that require a personal identification number.
242. An online form offers the possibility of interactive editing to improve response quality, which is not possible on a paper form. Respondents expect that the form offers guidance: at the very minimum, that they will be sequenced through the form and asked questions that are relevant to their situation. To ensure a high quality of data collected via the Internet, it is important to provide mechanisms to control response errors on the form. Such control should be conducted in real time, and the respondent should be able to modify any incorrect data immediately. If contradictions are found in the respondent's answers, the online form should identify this and provide the respondent with the opportunity to correct one or more answers, delete them, or confirm that the reported situation exists in real life (even if it was not foreseen by the developers of the form). However, a balance needs to be struck so that the burden of error checks on respondents is not so great that it discourages people from completing their form; hard checks should be reserved for priority questions such as age and sex. Another benefit of an online form is that it can allow individuals to complete their own elements of the form more easily within a household design.
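The distinction between hard checks, which block submission, and softer checks, which the respondent may confirm, can be sketched as follows. The age range, the threshold of 15, the field names and the messages are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Check:
    question: str
    message: str
    hard: bool  # True: must be corrected; False: may be confirmed as reported


def check_person(person: dict) -> List[Check]:
    """Return the checks triggered by one person's answers."""
    found = []
    age = person.get("age")
    # Hard checks are reserved for priority questions (age and sex).
    if age is None or not 0 <= age <= 120:  # hypothetical plausible range
        found.append(Check("age", "Please enter an age between 0 and 120.", True))
    if not person.get("sex"):
        found.append(Check("sex", "Please answer this question.", True))
    # Soft check: unusual but possible, so the respondent may confirm it.
    if age is not None and age < 15 and person.get("marital_status") == "married":
        found.append(Check(
            "marital_status",
            "Age is under 15 and marital status is married. "
            "Please correct or confirm.",
            False,
        ))
    return found


def can_submit(found: List[Check], confirmed: Set[str] = frozenset()) -> bool:
    """Allow submission when no hard check remains and all soft checks are confirmed."""
    return all((not c.hard) and c.question in confirmed for c in found)
```

Running the checks after each answer, rather than only at submission, corresponds to the real-time control described above.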
243. Providing the Internet option may contribute to improving the quality of the census by making it easier for some hard-to-enumerate groups to respond. Most countries report difficulties in enumerating particular population groups, for example, young adults and people living in secured accommodation where access is restricted. Some people with disabilities may also find it easier to complete an Internet form than a paper questionnaire. These groups are also more likely to be using the Internet for other purposes, and therefore, if available, this option should be promoted to these groups as a means of encouraging participation in the census.
244. Providing sufficient infrastructure is one of the major challenges of offering an Internet option. The census enumeration takes place over a relatively short period of time and involves the whole population of a country, and it is unlikely that the NSO will already have the needed infrastructure to cope with the peak demands of a census. It is therefore likely that the Internet solution can justifiably be outsourced. It may be necessary for collection procedures to be modified to constrain demand. For example, staggering the delivery of census questionnaires or invitation letters or requiring people outside predetermined target populations or areas to contact the NSO before they can use the Internet form may be a means of restricting use of the Internet form.
245. Census agencies should, therefore, assess how they wish to promote the use of the Internet. Such promotion should be determined by the capacity of the service to handle the expected load and should be coordinated with other data collection procedures. The public relations strategy should encompass assurances about the security and confidentiality of the information supplied via the Internet. Assuming that the Internet option is targeted to the whole population, the public relations strategy should also encompass managing public expectations about the ability to access the site during periods of peak demand. Simple messages of so-called “graceful referrals” advising people to use the Internet option at off-peak times should be prepared and used, if necessary, on the census Internet site itself, through any census telephone inquiry service and in any media promotion.
246. The take-up of the Internet response option can be expected to increase above the levels observed in the 2020 census round. During the data collection, census agencies should constantly monitor the levels of public response and make an effort to increase the level of online response if necessary.
5.5.2 Portable devices
247. The increasing sophistication and the reduction in unit costs of communication using laptop computers, tablet computers and smartphones mean that these may be a cost-effective solution for census data collection. Possible applications for such devices include the replacement of enumerator paper maps, address registers and lists and as a means of data collection in the field. They have possible applications in the full range of census collection methodologies from drop-off/pick-up through to the collection of the census questionnaires.
248. Portable devices have the advantage of being able to provide real time two-way management information. Census managers can be informed of the progress of the collection operations as the enumerators deliver census forms and collect completed returns. Likewise, census managers can provide the enumerator, via the portable device, with updates on forms received and on households that need to be followed up. Additionally, geospatial information for the collections (e.g. missing addresses, new developments) can be exchanged to allow efficient use of resources. Census managers can identify, in real time, areas where the enumeration is falling behind schedule or not meeting quality standards and instigate appropriate interventions (see section 5.4).
249. Centralized management of the portable devices for the census data collection includes the automation of the following functions:
(a) Installation on the devices of the following: special software for the census data collection by the enumerator (the electronic questionnaire); list of dwelling addresses and maps of the enumeration area; metadata and classifiers used in the electronic questionnaire; tools for monitoring the operation of the device and the enumerator; training materials for the enumerators;
(b) Linking between the device, the enumerator and the enumeration area as elements of the corresponding registers (dwellings, enumerators and devices), update of the registers and links between their elements in the event of the enumerators or devices being replaced;
(c) Online management and monitoring of each device’s operation after its initialization by the field staff;
(d) Clearing the device of all information and software used for the census purpose after the successful transfer of the collected data to the central data storage at the end of the census and preparing the devices for long-term storage (conservation) or use for other purposes (for example, transfer to another agency).
250. The online management and monitoring of each device’s operation includes:
(a) Obtaining information about the time when the device is turned on and off;
(b) Obtaining geo-coordinates for completing each form and recording data about the data collection process, such as interview duration and the number of completed questions, errors and corrections in the forms;
(c) Transmission to the devices of the addresses and the identification data of online respondents to enable the enumerator to verify the completeness of the enumeration at each address and make corrections if necessary;
(d) Remote installation of software updates on the devices in case of emergency. It is recommended to use this only when critical software errors are detected during the census, since it is important to ensure the consistency of data collected using different software versions;
(e) Locking the device in the event that a field worker reports the loss or theft of the device to prevent illegal use of the device or information leakage from it;
(f) Providing a means of remote consultation of the enumerator with the central census office.
251. Use of portable devices should allow greater opportunities for increased efficiency in data collection. However, several technical issues need to be considered in using such devices:
(a) Screen size may affect the ability of the enumerator to record and verify responses accurately. For the same reason, responding with mobile devices over the Internet risks fragmentation of data due to the small size of the screens.
(b) Compact and lightweight devices with sufficiently large storage capacity are most convenient for the field work of enumerators. The brightness and contrast of the screen should be adjustable so that the device can be used in both bright and dim light.
(c) To ensure the safety of data, completed information should be held in the devices for as short a time as possible. This time depends on how often the data synchronization processes are done, and is also determined by the time required for enumerators to finalize and verify completed questionnaires before synchronizing with the census data storage.
(d) Devices should be able to deal with being offline for periods of time. The length of battery life should be considered in relation to the daily workloads of field staff. It may be worth providing an additional power bank for the device.
(e) If system and software updates have to be made at the data collection stage, it is necessary to avoid the risks of loss of previously collected data or their inconsistency with the data collected after the update.
(f) The GPS accuracy (e.g. in densely populated urban areas) and the mobile signal reception (e.g. in mountainous or forest areas) may not be satisfactory in some areas of the country. An assessment of mobile web connectivity should be done, particularly if the portable device uses web-based collection.
252. Solutions based on portable devices should be extensively tested before the census phase, both on their own and in interaction with other elements of census technology that do not use portable devices.
253. There is also a range of security issues associated with the use of portable devices:
(a) There is a greater risk of devices being stolen or lost compared with bundles of paper forms. However, regular uploading of the data from such devices should minimize the need to re-enumerate areas if the devices are lost.
(b) Measures are needed to protect the confidentiality of any data on the device, in the event of loss of the device, and in transmission of the data. Data stored on the devices should be encrypted and only accessible through dedicated protection measures (e.g. passwords, fingerprints).
(c) Transmission of the data also needs to be secured through encryption and use of secure channels end to end.
(d) Security software should be loaded to the device and must be compatible with the other applications on the device. However, security software and passwords add an extra level of complication in use. These security measures will add to the support costs.
254. The training tools for the portable devices should be uploaded to the device for the convenience of their use by the enumerators for the training and during the field work. They should cover all the elements of the enumerator's work, be interactive, have easy navigation and contain illustrative examples of the enumerator's reaction in all possible situations of using this device.
255. Census agencies should think ahead about how to use the large number of devices after the census. It is impractical to store devices for the next census since they can become technologically outdated and unusable in 5 to 10 years without use and recharging. Census agencies may transfer some of these devices to other users (e.g. government organizations) while keeping some of the devices.
5.5.3 Telephone
256. In the past, automated telephone interviewing has been suggested as a potentially cost-effective solution for countries that have a short-form census questionnaire requiring only the capture of basic demographic information. However, no country applied it in the 2020 census round. Automated telephone interviewing is not recommended.
257. The Computer Assisted Telephone Interviewing (CATI) method can be used to collect data via the census questionnaire and/or to verify and complete any missing data collected on a long-form questionnaire. The user-friendliness of such systems decreases greatly as either the number and complexity of the questions increase or the number of people in the household increases.
5.5.4 Design of the electronic questionnaire
258. The design of the electronic questionnaire is a very important part of the technological solution when responses are provided online or collected using portable devices. The design of the electronic questionnaire should take into account the following requirements:
(a) Contain a complete set and a clear sequence of the questions, which are divided into open and closed questions;
(b) Contain skip patterns by automatically displaying only the relevant questions and skipping those that are irrelevant or not applicable to particular respondents;
(c) Consider all branching paths of the questionnaire, including for rare situations, so that each question is addressed to at least some subset of the population;
(d) Provide response options with the choice of only one option or several answer options for closed questions, and if the "other" option is selected, allow the capture of the respondent's own answer;
(e) Fit each question and its answer options on the device screen without scrolling or continuing to the next screen, if possible, because a hidden part of the question or answer options may be missed when answering;
(f) Make available the help option as a text hint or a jump to the appropriate element of the metadata or training materials;
(g) Provide for easy navigation between the questions for one respondent, between members of the same household and between different sections of the questionnaire (e.g. about the housing conditions, about the household, about the person);
(h) Use built-in controls for the validity of the entered data, taking into account previously entered information about the respondent and other members of this household;
(i) Display a progress bar for questionnaire completion as well as the general quantitative characteristics of the completed questionnaire, such as number of persons in the household and the percentage of relevant questions answered.
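Requirements (b), (h) and (i) above can be illustrated with a minimal sketch. It assumes a hypothetical three-question fragment; the question identifiers, the age limits and the rule that employment is asked only from age 15 are illustrative assumptions, not part of these Recommendations.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Question:
    qid: str
    text: str
    # Skip pattern: returns True when the question applies, given earlier answers.
    relevant: Callable[[dict], bool] = lambda answers: True
    # Built-in control: returns an error message, or None when the value is acceptable.
    validate: Callable[[object, dict], Optional[str]] = lambda value, answers: None


def check_age(value, answers):
    # Assumed plausibility range, for illustration only.
    if not isinstance(value, int) or not 0 <= value <= 120:
        return "Age must be a whole number between 0 and 120."
    return None


# Hypothetical questionnaire fragment.
QUESTIONS = [
    Question("age", "Age in completed years", validate=check_age),
    Question("employed", "Did the person work last week?",
             relevant=lambda a: a.get("age", 0) >= 15),
    Question("occupation", "Main occupation",
             relevant=lambda a: a.get("employed") == "yes"),
]


def relevant_questions(answers):
    """Questions that apply to this respondent, given the answers so far."""
    return [q for q in QUESTIONS if q.relevant(answers)]


def next_question(answers):
    """First relevant question not yet answered, or None when finished."""
    for q in relevant_questions(answers):
        if q.qid not in answers:
            return q
    return None


def progress(answers):
    """Percentage of currently relevant questions that have been answered."""
    relevant = relevant_questions(answers)
    answered = sum(1 for q in relevant if q.qid in answers)
    return round(100 * answered / len(relevant))
```

In this sketch a 10-year-old respondent is never shown the employment questions, and the progress figure is computed only over the questions that remain relevant.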
259. Where an electronic questionnaire is used both for online self-completion by respondents and on enumerators' devices, the design of the questionnaire may differ between the two, because respondents have no knowledge of the census methodology, whereas enumerators are trained in advance and are familiar with census terminology and the metadata built into the questionnaire.
260. The online form should additionally contain a summary of the basic requirements for its completion by the respondent. These include:
(a) A description of the general structure of the questionnaire and the sequence of its completion;
(b) An estimated time for completing the questions, per respondent or per household;
(c) A description of ways to call up help information and respond to error messages;
(d) A description of the possibility to correct, delete or add some information to the previously completed questionnaire, if necessary;
(e) The signs of successful and unsuccessful completion of the census process and further actions of the respondent (e.g. obtaining confirmation of participation in the census);
(f) A means of providing feedback to the NSO, such as a telephone number or email address of the census hotline or of the NSO, to evaluate the quality of online services or to ask questions that were not answered when filling out the form;
(g) The ability to obtain a translation of the form into the most widely used languages in the country, if necessary;
(h) The provision of answers to frequently asked questions with terminology accessible to respondents and a link to the page of the NSO website where the legal, methodological and organizational principles of the census are described.
5.5.5 Technology to support the enumeration of people with disabilities and without Internet access
261. When introducing new technologies, it is necessary to keep in mind that the census must cover the entire population, regardless of the technical equipment used and of respondents' computer proficiency.
262. Technology can support the enumeration of people with disabilities and of the digitally disconnected in two main ways: (a) by reaching respondents who do not have the necessary Internet connection; (b) by allowing as many people as possible to enter their responses electronically.
263. The digitally disconnected. In order to reach as many people as possible, it will be necessary to identify locations that do not have adequate Internet connectivity. One option that NSOs should consider in such cases is to offer a paper questionnaire. However, other options should also be explored, such as deploying enumerators to gather information with portable devices connected to the Internet via satellite, or giving respondents a phone number to call so that they can complete their form by telephone.
264. Accessible internet response. Another aspect to consider is how people with disabilities could fill out their census form over the Internet. Internet response should follow accessibility standards as defined by the Web Content Accessibility Guidelines (WCAG) 2.2.21 While adhering to those standards technically, it is important to consider accessibility already at the stage of developing the content. Before developing any new features that could impact accessibility, consultation with a centre of expertise in accessibility and a user experience group should be performed to ensure that new functionalities are designed to be properly accessible. Examples of the applied features include:

21 World Wide Web Consortium (2024). Web Content Accessibility Guidelines (WCAG) 2.2.

(a) Hidden text for auto-generated character mask fields, to inform vision impaired users of the necessary inputs required to be typed by the user and those that will be provided automatically;
(b) Required colour contrast to ensure users can view text content;
(c) Techniques for associating labels with interactive controls to allow assistive technology to recognize the label and present it to the user, therefore allowing the user to identify the purpose of the control.
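The colour contrast requirement in (b) can be checked programmatically. The sketch below follows the relative luminance and contrast ratio definitions of WCAG 2.2 and tests a colour pair against the 4.5:1 minimum that applies to normal-size text at level AA. The function names are illustrative; large text and non-text elements are subject to different thresholds that are not covered here.

```python
def _linearize(channel_8bit):
    """Convert one 8-bit sRGB channel (0-255) to its linear-light value."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (R, G, B) colour given as 8-bit integers."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background):
    """Contrast ratio between two colours, from 1 (identical) to 21 (black on white)."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def meets_aa_normal_text(foreground, background):
    """True when the pair reaches the 4.5:1 minimum for normal-size text."""
    return contrast_ratio(foreground, background) >= 4.5
```

For example, black text on a white background passes, whereas light grey text (170, 170, 170) on white does not.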
5.5.6 Data capture from paper questionnaires
265. Based on practices in the 2020 census round, it can be assumed that most countries with direct enumeration in their next census will use the Internet response option. The use of paper forms and optical recognition technology can therefore be expected to be limited. Data collection with paper questionnaires may nonetheless be necessary because of respondents’ preference or lack of access to the Internet. In comparison to data collection over the Internet or by using electronic devices, this requires additional processing steps such as scanning, data capture and possibly also keying by operators. The fact that the system needs to interpret different handwriting contributes to the complexity of the process.
266. For processing the paper questionnaires, it is recommended to use automated processes such as Intelligent Character Recognition (ICR).
267. Optical Mark Recognition (OMR) can be a cost-effective option where the census form contains only tick-box responses. Additional means of data capture or computer-assisted coding operation are required to handle write-in responses. However, OMR has largely been superseded by ICR technologies.
268. The most cost-effective option is likely to be a combination of digital imaging, ICR, repair and automated coding. An example of this process is briefly described below.
(a) The census forms are processed through scanners to produce an image. Recognition software is used to identify tick box responses and translate handwritten responses into textual values. Confidence levels are set to determine which responses are of acceptable quality and which responses require further repair or validation;
(b) Automated repair is designed to reduce the need for operator intervention and typically involves the use of dictionary look-up tables and contextual editing. The dictionaries are tailored according to the census question being processed. Thus, for example, the dictionary for country of birth question would only contain names of countries. Preparatory work on the construction of natural language dictionaries of terms will greatly increase the efficiency of coding;
(c) Operator repair can be undertaken on images not recognized. This is only cost-effective for those questions where there is a high probability that the repaired data can then be automatically coded;
(d) Automatic coding uses computerized algorithms to match captured responses against indexes. Those responses that cannot be matched are then passed to a computer-assisted coding process. For the responses that cannot be automatically coded, it is recommended to use a machine-learning algorithm, which could replace human coders with equal or even better data quality and at substantially lower cost. Data from a previous census can be used to train the machine-learning algorithm. Data from the current census testing cycle could also be used, especially if new variables need to be coded, provided that the resulting quality can be verified to be equal to or higher than that achieved by human coders;
(e) Further considerations on the use of digital imaging, ICR, repair, automated coding, Optical Mark Recognition (OMR) and Optical Character Recognition (OCR) are presented in the CES census recommendations for the 2020 round.22 Generative artificial intelligence can be expected to open up new possibilities and to replace manual keying by operators. However, it requires investment in building proper models in order to maintain high data quality.
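The routing logic described in steps (a) to (d) can be sketched as follows for a single write-in question. This is a simplified illustration: the country dictionary, the two thresholds and the function names are assumptions, and the recognized text and its confidence value are taken as given inputs from the recognition software. Approximate string matching from the Python standard library stands in here for the dictionary look-up used in automated repair.

```python
import difflib

# Hypothetical question-specific dictionary and code index (illustrative values only).
COUNTRY_CODES = {"FRANCE": "FR", "GERMANY": "DE", "ITALY": "IT", "SPAIN": "ES"}

ACCEPT_CONFIDENCE = 0.90  # assumed: at or above this, recognized text is accepted as is
REPAIR_CUTOFF = 0.80      # assumed: minimum similarity for automated dictionary repair


def automated_repair(text):
    """Return the closest dictionary entry, or None when nothing is close enough."""
    matches = difflib.get_close_matches(
        text.upper(), list(COUNTRY_CODES), n=1, cutoff=REPAIR_CUTOFF
    )
    return matches[0] if matches else None


def process_response(recognized_text, confidence):
    """Route one recognized write-in response; returns (code, route)."""
    text = recognized_text.strip().upper()
    # Acceptable recognition quality and an exact index match: code automatically.
    if confidence >= ACCEPT_CONFIDENCE and text in COUNTRY_CODES:
        return COUNTRY_CODES[text], "auto-coded"
    # Otherwise attempt automated repair against the question's dictionary.
    repaired = automated_repair(text)
    if repaired is not None:
        return COUNTRY_CODES[repaired], "auto-repaired"
    # Anything left is passed on to operators or computer-assisted coding.
    return None, "operator repair or computer-assisted coding"
```

In this sketch, a clearly recognized "France" is coded directly, a low-confidence "Germnay" is repaired to Germany from the dictionary, and an unreadable string is passed on for manual handling.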
5.6 Administrative data
5.6.1 Scope
269. The technologies applied for the use of administrative data differ from those for data collection in the field. The development and increasing availability of new information and communication technology (ICT) allow administrative registers to be utilized more widely in population and housing censuses. Bearing in mind the development of state-of-the-art technologies and the commitment of agencies to implement innovative solutions in the 2030 census round, it will be necessary to create or modernize the software and hardware infrastructure for collecting, storing and linking data from administrative sources and for storing metadata on processes and products.
270. The quality of the source data has a large impact on the quality of output products. Therefore, the methodology for improving the quality of data from administrative sources, for example, by adjusting them to satisfy statistical requirements, is of vital importance. State-of-the-art ICTs may prove very useful here and have a key impact on improving the efficiency and effectiveness of these operations. For assessing the quality of administrative sources for use in censuses, reference is made to the UNECE guidelines on this matter, published in 2021.23
271. As part of the preparatory work for the census, particularly in the design phase, the necessary technical requirements related to the use of data from administrative registers, which may affect the need to modernize infrastructure, should be determined in the following areas:
(a) Data collection;
(b) Data storage;
(c) Data linking;
(d) Storage of metadata on processes and products.
272. The application of several techniques for collecting data from administrative registers and other sources for use in population and housing censuses will require more comprehensive organization and management processes and more complex systems. Modern technologies provide opportunities for improvement in this case as well. The process of collecting data from administrative registers should include the preparation of a data-collection strategy using various data-collection modes.
5.6.2 Security and confidentiality24

24 See also section 7.1.1 Confidentiality principles.

273. It is crucial to consider the growing emphasis on data security, privacy and data protection in society. This is evident in the legal and statistical frameworks of many countries. Moreover, EU member states must comply with the General Data Protection Regulation (GDPR).25 Consequently, there are increased requirements on how the agencies conducting a register-based population and housing census should receive and process data. To address these demands, a strong emphasis on the need-to-know principle and the processing of anonymized data is recommended.

25 Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data.

274. Secure IT infrastructure is a necessary condition for data collection from administrative registers. A crucial issue connected with the process of data collection is the protection of data. Regardless of the technology applied, the data collection strategy should ensure information security. This requirement should be addressed at the early stages of designing the process of obtaining and gathering data from administrative registers and designing the proper software and hardware infrastructure. The technical issues concerning the coding of data transmission should be considered in detail, together with the use of secure transmission channels.
275. Appropriate technical and organizational measures should be implemented to protect the data against accidental or unlawful destruction or accidental loss (including backups), alteration, unauthorized disclosure or access.
276. Security must be implemented in multiple layers. Register owners, data-collecting NSOs and other partners must establish secure transmission channels with corresponding certificates to control access granted to collectors.
277. By organizing data into different “states” - such as source data, prepared data, statistical data and output data - authorization management can be effectively implemented for various roles and teams within the NSO. Data access should be enforced on a need-to-know basis, with anonymized identifiers and disclosure controls applied across all aggregation levels.
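A minimal sketch of these two ideas is given below, under stated assumptions. Identifiers are replaced by a keyed hash (HMAC-SHA256), which strictly speaking is pseudonymization rather than full anonymization, since whoever holds the key can reproduce the mapping. The role names, their permitted data states and the placeholder key are hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would be held in a key-management system,
# not in source code.
SECRET_KEY = b"replace-with-a-managed-secret"

# Hypothetical mapping of roles to the data states they may read.
ROLE_ACCESS = {
    "register_intake": {"source"},
    "methodologist": {"prepared", "statistical"},
    "dissemination": {"output"},
}


def pseudonymize(person_id):
    """Replace a register identifier with a keyed hash that still supports linking."""
    return hmac.new(SECRET_KEY, person_id.encode("utf-8"), hashlib.sha256).hexdigest()


def can_read(role, state):
    """Need-to-know check: a role may read only the data states assigned to it."""
    return state in ROLE_ACCESS.get(role, set())
```

Because the same input always yields the same pseudonym, records from different registers can still be linked on the hashed value without exposing the original identifier to the teams working with prepared or statistical data.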
5.6.3 Data linking, transfer and storage
278. Modern technologies are useful in the process of linking records and data. After identifying the administrative sources to be used in the census, it would be necessary to map them and create application programming interfaces (API) for the automatic flow of data to the central.
279. Administrative sources differ from one another, both across subject areas within a country and between countries. If possible, it is advised to use standardized APIs. Regardless of the interface used or differences encountered, good metadata and a good understanding of the data are very important for their inclusion in the census.
280. Many countries are currently transitioning to cloud-based storage and computing (see section 5.9). Compared to traditional on-premises solutions, this shift offers new possibilities, including the potential for a more flexible, variable-controlled universe of data, or data lakes, rather than traditionally thematically stored, structured and updated data. These data lakes will place high demands on metadata and accurate periodization, made more feasible by the increased processing power of cloud-based computing. Centralizing the administrative data into a data lake would support the business process and can provide a strong and reliable infrastructure for storing, creating metadata and synchronizing the different sources of data.
5.6.4 Improving the administrative data
281. Various techniques are useful for converting administrative data into statistical data. With the procedure of automatic data cleaning in place, it is possible to eliminate errors in source data from administrative registers and edit the data efficiently.
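As an illustration of what such an automatic cleaning step might look like, the sketch below standardizes two fields of a single register record and reports the errors it finds. The field names, the accepted codes for sex, the date format and the rule that a birth date cannot lie after the census reference date are assumptions made for the example.

```python
from datetime import date, datetime

# Assumed source codes and their harmonized values.
SEX_CODES = {"M": "male", "F": "female", "MALE": "male", "FEMALE": "female"}


def clean_record(raw, reference_date):
    """Return (cleaned_record, errors) for one administrative record."""
    errors = []
    # Trim stray whitespace from all text fields.
    record = {k: (v.strip() if isinstance(v, str) else v) for k, v in raw.items()}

    # Harmonize the sex variable; unknown codes become None and are reported.
    record["sex"] = SEX_CODES.get(str(record.get("sex") or "").upper())
    if record["sex"] is None:
        errors.append("sex: unrecognized value")

    # Parse the birth date and check it against the census reference date.
    try:
        birth = datetime.strptime(str(record.get("birth_date") or ""), "%Y-%m-%d").date()
    except ValueError:
        birth = None
        errors.append("birth_date: not a valid YYYY-MM-DD date")
    else:
        if birth > reference_date:
            birth = None
            errors.append("birth_date: after census reference date")
    record["birth_date"] = birth
    return record, errors
```

Records returned with a non-empty error list could then be routed to editing or imputation rather than being silently corrected.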
282. Machine-learning models could be used, for example, in determining the “census address” of the individual or type of the private household.26

26 Using machine learning methods to determine type of private household in Switzerland: ECE/CES/GE.41/2022/6.

283. Artificial intelligence (AI) could be used, for example, for determining the classification of the economic branch of employment. Classification using AI can be more efficient, deliver high-quality results and allow a high percentage of cases to be classified. The classification models used for this purpose need constant updates and quality analysis to evaluate the results. It is understood that specific security and confidentiality requirements may not always allow working with such data and models in the cloud. There are also known limitations of using classification models developed in English in the context of other languages.
5.7 Output production and dissemination
284. Traditionally, census output comprises aggregated tables, statistics, illustrations and maps with appropriate metadata (see Chapter 6 and Chapter 7). The demand for evidence-based policy and planning generates a demand for census data from an increasingly wide range of users. Output systems therefore need to reach diverse users, ranging from those who would look for quick access to basic headline figures to those expecting to conduct advanced analyses.
285. Online dissemination via the Internet additionally allows for the design of products to better meet the needs of different kinds of census data users, for the cost-effective dissemination of a much wider range of census data and for the improved usability of the data. Application programming interfaces (API) should be made available to increase usability of the provided data.
286. Functionality and data content can be targeted to satisfy the different levels of users. This functionality should be seamless, from the simple to the sophisticated, with users being led by the nature of the query or analysis they wish to undertake using different products.
287. One of the main objectives of the census is to produce information for small geographic areas and for small population groups. Internet dissemination can support both types of use of the data. For small geographic areas, GIS technology can be used as a means both of defining areas of interest when searching for data and of mapping the outputs of the search. There is a range of software packages that can be used to zone in on populations of interest from large pre-defined matrix tables.
288. The Internet dissemination system should provide flexibility for users to export the results into a range of commonly available packages for statistical analysis, tabulation or mapping.
289. As the amount of data gathered increases, data visualizations have been introduced to help data users at all levels of experience to understand key information derived from the data. Data visualization can be used to communicate a message quickly, to simplify the presentation of large amounts of data, to see data patterns and relationships, and to monitor changes in variables over time.
290. It is recommended to disseminate census data on interactive online platforms where different user groups can create their own tables and visualizations (e.g. graphs, maps) according to their needs.
291. Focus should be on the application of the FAIR guiding principles for scientific data management and stewardship.27 The principles emphasize the capacity of computational systems to Find, Access, Interoperate and Reuse data with no or minimal human intervention. The analytical solutions in the platforms may remain limited as this is not the main purpose of the census.

27 GO FAIR Initiative. Wilkinson M, Dumontier M, Aalbersberg I et al. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data 3, 160018.

292. Census data users have very different levels of statistical literacy or computer proficiency. It is recommended that the NSO should consider the following groups of users:
(a) Occasional users who mainly look at data visualizations or report highlights;
(b) Intermediate users who customize visualizations or select specific data;
(c) Technically advanced users who would download data to perform their own analysis and use application programming interfaces (API).
293. International standards such as SDMX should be considered a priority for formatting the output data and metadata that is made available to users.
294. Whatever the means of access or dissemination, protecting the statistical confidentiality of the census data is a prime consideration in any such systems and statistical disclosure control procedures should be in place.
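One of the simplest disclosure control techniques, shown here purely as an illustration, is threshold-based primary suppression of small cell counts before a table is published. The threshold of 3 and the table layout are assumptions; the sketch does not cover secondary suppression, rounding or perturbation methods, and it is not presented as the method recommended by this chapter.

```python
def suppress_small_cells(table, threshold=3):
    """Replace counts strictly between 0 and the threshold with None (suppressed)."""
    return {
        cell: (None if 0 < count < threshold else count)
        for cell, count in table.items()
    }
```

For example, `suppress_small_cells({"area A": 120, "area B": 2, "area C": 0})` keeps the counts for areas A and C and suppresses the count for area B.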
5.8 Storage and archiving
295. Data storage technology has evolved in line with new technology trends. Historically, data has been stored on-premises within the NSO, using traditional media tapes sometimes complemented by disk storage. However, as the volume of data has continued to increase exponentially, the capabilities of storage solutions have had to adapt. Many organizations are now considering implementing a hybrid storage approach, utilizing a mix of cloud and on-premises storage solutions based on individual needs.
296. Certain factors will need to be considered in determining what storage policy is appropriate for individual statistical organizations.28 These include:

28 See also section 8.1 Benefits and costs.

(a) The risk of utilizing additional storage solutions that are off-premises;
(b) New skillsets required to develop and maintain advanced storage solutions such as a hybrid approach. These can be difficult to obtain and costly to retain;
(c) Each NSO should ensure that it understands its data in terms of data classification. This will aid the decision-making required when determining how to build its storage design capability;
(d) As organizations consider new storage solutions, cyber security safeguards are required to identify, detect, protect, respond and recover from cyber risks and threats. Their cyber defence strategy will need to ensure storage protections are available in a hybrid environment;
(e) There are many cloud-based storage approaches, including glacier storage for long-term retention of data that is not frequently used. Organizations that have relied on backing up all data using full, incremental or differential methods will need to develop a deeper understanding of their data when leveraging the capabilities of these new storage approaches;
(f) In partnering with external storage service providers, NSOs will need to ensure appropriate agreements are defined that meet their expectations in service delivery;
(g) Storage and archiving should follow legal texts and well-established rules. For example, in the European Union, any scientific manuscript data should be kept for 10 years;29

29 Commission Decision (EU) 2021/2121 of 6 July 2020 on records management and archives.

(h) Backup and recovery procedures must be implemented and tested to perform any steps needed to restart the systems and services, to verify their operations and data integrity;
(i) It is recommended that a business continuity plan be in place and tested, covering the prevention of, and recovery from, threats to the systems and their related data, such as natural disasters or cyber-attacks.
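The data integrity verification mentioned in (h) can be illustrated by comparing cryptographic checksums of an original file and its restored copy. This is a minimal sketch with illustrative function names; it checks file content only and assumes both files are readable from the machine running the test.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(original_path, restored_path):
    """True when the restored file is byte-for-byte identical to the original."""
    return sha256_of(original_path) == sha256_of(restored_path)
```

In a recovery test, a checksum recorded at backup time could be compared with that of the restored file, so that the original does not need to remain available.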
5.9 Cloud
297. Cloud computing is growing very rapidly and is expected to become the most common IT infrastructure in businesses globally. Cloud adoption strategies have ranged between cloud-only, cloud-first and, more recently, cloud-smart.30

30 UNECE 2024. . Geneva: United Nations.

298. Cloud adoption is also becoming prominent for government entities worldwide, as they recognize the transformative potential of cloud computing in enhancing operational efficiency, scalability and service delivery. Embracing cloud technology allows them to optimize resource utilization, improve data accessibility and foster innovation across various domains.
299. The development of the cloud has evolved to include multiple platforms and service models. Service models include Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). Platforms include public, private and community cloud. Each of these individual models can provide opportunities and challenges for statistical organizations in their adoption and deployment.
300. NSOs have to consider and evaluate several elements for the adoption of the cloud.31

31 ibid.

(a) Data sovereignty – a requirement that a cloud service provider guarantees that cloud services and underlying infrastructure are designed to provide data access in compliance with laws and regulations of the originating country of the data in question. Global cloud providers rely on global contracts that may not align to local laws and legislation. There may be legal restrictions concerning the geographical location of the provider’s data centres.
(b) Jurisdiction in the event of a dispute. This will require localized agreements agreed at a national level.
(c) Data security measures and protocols the provider has in place.
(d) Perception among the public, politicians and stakeholders regarding data security, privacy and control with cloud-based solutions.
(e) Freedom to choose the best suited provider and consider the provider’s track record and reputation in the industry.
(f) Financial implications of the use of cloud.
301. A transitional approach to the adoption of the cloud with incremental steps could be appropriate for many statistical organizations. Beginning with less sensitive applications and systems and learning from the experiences can provide a greater degree of comfort to the key stakeholders making the decision to adopt the cloud. Ensuring that the risk is assessed and measured on an ongoing basis will also help with the transition.
302. Cloud infrastructure can bring important advantages if the census has a 5- to 10-year cycle. A benefit of using the cloud is the possibility to scale up and down easily when required. This could result in cost savings for the census programme, but it is nonetheless necessary to proceed with caution and to consider the broader implications for the agency. For example, on-premises infrastructure built for the census could be reused for other purposes of the agency. So, while there may be cost-saving possibilities for the census programme, these should be evaluated at the level of the entire statistical system. For a shorter cycle such as an annual census, the cloud may not be the best approach.
303. Further advantages are related to investment in new cloud technologies such as cloud-native approaches and Platform as a Service. A multi-cloud architecture should also be considered in order to benefit from the strengths of different cloud providers. For example, there could be one cloud infrastructure for the application and another one that will host the database.
304. Cloud infrastructure allows a lot of flexibility but should be strictly managed in order not to pay for infrastructure that is not used. Otherwise, using the cloud may turn out to be the more costly option. Cloud servers should be scaled down when not used and scaled up when required.
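The effect of scaling can be made concrete with a small worked calculation. All figures below are hypothetical: a rate of 0.50 currency units per server-hour, 100 servers needed during a 30-day collection peak and 5 servers for the remaining 335 days of the year. Real pricing models include storage, data transfer and other charges that are not represented here.

```python
HOURS_PER_DAY = 24


def compute_cost(rate_per_server_hour, schedule):
    """Total cost for a schedule given as a list of (days, servers) periods."""
    return sum(
        days * HOURS_PER_DAY * servers * rate_per_server_hour
        for days, servers in schedule
    )


RATE = 0.50  # hypothetical price per server-hour

# Peak capacity kept running all year.
always_on = compute_cost(RATE, [(365, 100)])

# Capacity scaled up for the collection period and down afterwards.
scaled = compute_cost(RATE, [(30, 100), (335, 5)])
```

Under these assumptions, keeping peak capacity running all year costs 438,000 units, against 56,100 units when capacity follows demand.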
305. The cloud migration process should be carefully evaluated and planned. Simply migrating to the cloud in order to use another data centre is not recommended, as it may cost more than staying on premises. Countries should instead aim to be cloud-smart, that is, to move to the cloud by rebuilding applications in the cloud or by leveraging the cloud-native approach of building, deploying and managing applications in cloud computing environments.
306. Using the cloud can bring important advantages to the census but is not always the best approach. It is recommended to do a thorough evaluation before moving to the cloud.