Teresa A Sullivan. Census 2020: Understanding the Issues. Springer, 2020.
Sticking to traditional approaches within the demographic research community might prevent further progress, or just let other, bolder, communities of scholars bring the advances needed to further our understanding of population processes … A fruitful way ahead is perhaps to combine traditional approaches with new ones: counting and now-casting, indirect estimation and the use of non-representative Web-based data, official statistics and digital breadcrumbs. (Billari and Zagheni 2017: 176).
As important as the U.S. Census is, and as significant as its Constitutional mandate is, the census is but one of multiple data sets available today to government officials, business leaders, community organizers, teachers, and the general public. Chapter 5 reviewed the dangers that the proliferating data sources may pose in terms of allowing individual reidentification of someone included in the census. These proliferating databases offer a totally different possibility as well: replacing the decennial census with another approach or with a combination of approaches.
The Continuous Population Register
One alternative to the census, the continuous population register, is used in some countries, such as those in Scandinavia (Poston Jr. and Bouvier 2017, pp. 46-54). The population register tracks every member of a population from birth until death. Although countries vary somewhat in the information that they register, some typical possibilities include school attendance and school leaving; address and changes of addresses; military service; marriages and divorces; and eligibility for and receipt of government benefit programs. Depending upon the country, eligibility for health care, pensions, further education benefits, and other programs might be included.
The Chinese tradition of registering population dates back to the Han dynasty (206 BC – 220 AD) and was adopted elsewhere in Asia (Taeuber 1959, p. 261). In Europe the continuous population register originated before modernity in religious parish records (Shryock et al. 1976, p. 13). Later the population registers, sometimes termed civil registers, were maintained by local and then national governments.
Sweden forms an exemplar of this change from religious to secular record-keeping. By royal decree, births and deaths have been recorded in Sweden at least since the 1600s. Until 1991 the Church of Sweden (and later congregations of other faiths) maintained the register, and since then it has been maintained by the Swedish Tax Authority. Any change in residence longer than six months must be recorded, along with place of birth, citizenship, immigration to or emigration from Sweden, and a personal identification number (PIN). This PIN is used for most interactions with the government. Individuals in Sweden have the legal right to see any information about themselves that is contained in the registry. Because of the length of time this registry has been kept, it has multiple medical, historical, and demographic uses.
Advantages of a Continuous Population Register
A census is a snapshot of a country at a specific point in time. In the United States, this snapshot is available only once a decade. In reality, however, populations are always in motion: people within the population are continually being born, dying, and moving from one location to another. The population register better represents this dynamism. Moreover, because the register is always being updated, there is no need to ramp up every decade to take the census. The cost of maintaining the register could well be substantially less than the cost of planning, testing, advertising, conducting, and then analyzing the census. The need for parallel agencies, such as a census bureau and vital statistics bureaus, could be reduced.
Disadvantages of a Continuous Population Register
To be sure, the population register is not free of error. Potentially an illegal alien could avoid inclusion in the register, and individuals within the population could deliberately or forgetfully fail to register every event when required. While a single system is efficient, the existence of parallel systems (e.g., census and vital statistics) permits the development of quality checks of the data.
The multitude of governments within the United States is also potentially an issue. Would a population register be the province of the federal government, which is currently responsible for the census, or of the fifty-plus vital records bureaus now run by the states and the District of Columbia? And if the latter, how would coordination be achieved?
Population registers need to account for migration, and the United States is a very mobile society. Americans are used to the idea of registering births and deaths with their states, but a requirement to register every move from one apartment to another would not easily fit Americans’ mobile lifestyles.
Moreover, having a single PIN that could be used in every government transaction could seem an invitation to cyberthreats. The Social Security number already functions as a single PIN for many government purposes, and hacking of Social Security numbers has been a major cyberthreat.
Partial Population Registers
The United States has many administrative records that function as partial population registers. An administrative record consists of data collected for a particular governmental purpose, such as conducting a particular program. Unlike the continuous population register, these administrative records are partial registers because they do not cover the entire population. And unlike Title 13 data, which are collected solely for statistical purposes, the administrative data are originally collected for a different purpose but might become useful for statistical purposes.
An example of a partial population register is driver’s license records. The states issue driver’s licenses to eligible individuals, typically people over a minimum age who have passed required tests and paid a fee. The principal purpose of the driver’s license is to show that a person has the appropriate qualifications to drive a motor vehicle. Driver’s licenses have many other uses beyond those of the Motor Vehicle Department. The driver’s license is used as a form of identification, and it is sometimes used to register to vote, to apply for a job, and even to indicate an intent to become an organ donor. The registry of driver’s licenses in many states covers a large fraction of the population of driving age.
Many federal administrative records cover a large enough portion of the population to qualify as partial registers. The Internal Revenue Service has tax records for most adult Americans and many children. Social Security, especially Medicare, is believed to have nearly universal coverage of the senior citizen population. Should Medicare-for-all become a public policy, then there is a possibility that Medicare could function as a continuous population register. Selective Service is a partial register for men over 18 years of age, although it is not kept up to date for men older than the age at which the former military draft operated. The Veterans Administration has data on veterans and their service records.
Other partial registries are kept by the states. Some examples include state tax records, land ownership records, motor vehicle registries, hunting and fishing licenses, and voting rolls. In some states there are also registries for certain occupational licenses, school attenders, concealed handgun permits, convicted sex offenders, and recipients of various benefits such as Medicaid. Unlike confidential Title 13 data, some state registers are required to be publicly available under the state’s Freedom of Information Act (FOIA).
There are also non-governmental records that might be considered partial population registers. Credit agencies, although private, cover a large fraction of the population, and include information on name, address, income, occupation, Social Security number, and sometimes co-borrowers (often a spouse), in addition to the expected information on the status of credit accounts and pending legal actions. It seems increasingly likely that as social media cover more of the population there will be more ways in which Facebook or Twitter accounts might be “scraped” to serve as partial registries.
Given the discussion of differential privacy in Chap. 5, the reader has probably already anticipated that there are many ways to misuse administrative data and that some safeguards are necessary.
The regulation of Electronic Medical Records (EMRs) is instructive. An EMR assembles information from the various medical visits and hospitalizations of a patient. Current medications, test results, and diagnoses are included. A health care provider has immediately at hand the patient’s medical history and can add symptoms, vital signs, and other information. This information saves time, minimizes prescription errors, and generally provides better care. For a patient who is brought unconscious to a hospital, the EMR can be a lifesaver.
On the other hand, the EMR takes some of the most sensitive information about a person and combines it into an electronic file that could potentially be hacked or otherwise misused. To protect medical information, Congress has passed a stringent act called HIPAA (Health Insurance Portability and Accountability Act). Among other things, HIPAA establishes industry-wide standards for electronic billing and other health care information, and it requires the protection and confidential handling of this information.
Linking the Partial Registers
Given the care and effort devoted to safeguarding this one type of medical information, and the issues around privacy and confidentiality discussed for census data in Chap. 5, it is apparent that the linking of registers is fraught with issues. Linking records is not technically difficult. In fact, it is so easy to do that there are laws that limit the linkage of certain records, that specify the permissions required for linkage, or that list the procedures that must be followed to allow the linkage. Why this linkage is significant for Census Bureau operations is discussed below.
Census Bureau Use of Administrative Records
The Census Bureau accesses the federal partial population registers and other administrative data for many of its functions. In fact, Title 13 of the U.S. Code authorizes the use of administrative data instead of direct inquiries “to the maximum extent possible and consistent with the kind, timeliness, quality and scope of the statistics required.” Because of this authorization the Census Bureau is proficient at accessing the records of other federal agencies. This is, by the way, a one-way street: information can come from a federal agency to the Census Bureau, but that same agency cannot requisition census records.
Congress has expressed a wish to reduce the respondent burden or survey fatigue that comes from repeated questioning. Respondent burden is often cited as a reason for shorter questionnaires, for paperwork reduction requirements, and for requiring government documents to be written in plain English. The development of the American Community Survey and the retirement of the census long form represent a desire to reduce the burden on respondents while having more timely information. Filling in the answers to census or survey questions with answers that are already in an administrative record is cost-effective and requires less time and effort of the respondent (Ortman 2018).
The average American may feel besieged by the sheer volume of commercial messages, robo-calls, solicitations, and surveys from non-governmental organizations. While this deluge of requests is not the fault of the Census Bureau, the Census Bureau must nevertheless deal with the declining response rates that result when Americans are simply fed up with answering one more questionnaire.
As Chap. 3 indicated, as part of non-response follow-up in the 2020 Census, the Census Bureau is testing the use of administrative records to impute race, age, and Hispanic origin if these pieces of information are missing. Information from administrative records will also be compared with census returns as an error check. More extensive use of administrative records with the ACS is also planned. And for many years the Census Bureau has used administrative records such as birth and death records to make estimates and projections of population size between censuses.
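The between-census estimates mentioned above rest on a simple accounting identity, often called the demographic balancing equation: the population at a later date equals the base count plus births, minus deaths, plus net migration. The sketch below is only an illustration of that identity. The function name and all of the figures are hypothetical, and in practice the migration component must itself be estimated from other sources.

```python
def estimate_population(base_count, births, deaths, net_migration):
    """Demographic balancing equation.

    base_count    -- population at the last census (or prior estimate)
    births/deaths -- events recorded since then, e.g. from vital records
    net_migration -- in-migrants minus out-migrants over the same period
    """
    return base_count + births - deaths + net_migration


# Hypothetical figures for an illustrative county, one year after a census.
estimate = estimate_population(
    base_count=100_000, births=1_200, deaths=900, net_migration=350
)
print(estimate)
```

Births and deaths come from registration systems with close to complete coverage, which is why migration is usually the weakest term in such an estimate.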
Not All Information Is Treated the Same Way
Federal agencies participate in information sharing programs that are limited by their statutory authority. Non-confidential information is shared among agencies that face similar issues, such as hacking or cybersecurity problems and solutions (Rockwell 2017). An important fact to reinforce is the one-way nature of the linkage of individually identifiable data to the Census Bureau. Because of Title 13, data about households or individuals that comes into the Census Bureau also comes under the shield of confidentiality. This means that the census information can be used only for statistical purposes.
Developing a Citizenship Count (CVAP)
With this background on registers, administrative records, and linkages, we return to the issue of citizenship and the 2020 Census. As explained in Chap. 4, the Supreme Court struck down the addition to the 2020 Census of a question about citizenship. About two weeks after the Supreme Court’s decision, President Donald J. Trump issued an executive order instructing federal departments and agencies to provide to the Census Bureau the citizenship data that they already held in their databases. President Trump said, “Some states may want to draw state and local legislative districts, based upon the voter eligible population” (Rogers et al. 2019).
The first release of data from the 2020 Census will be the Census Unedited File (CUF), scheduled for preparation by November 30, 2020, and for release to the President and Congress by December 31, 2020. This file will contain the count of the population for each state, together with overseas federal employees and their dependents. That overseas population will be allocated back to its state of residence. These data are then analyzed with the apportionment formula to produce the number of representatives allocated to each state. The CUF does not contain any citizenship data.
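The apportionment formula is the method of equal proportions, which federal law has prescribed since 1941. Every state first receives one seat. Each remaining seat then goes to the state with the highest priority value, defined as the state's population divided by the square root of n(n + 1), where n is the number of seats the state already holds. A minimal sketch follows. The function name and the three-state populations are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import heapq
from math import sqrt


def apportion(populations, total_seats):
    """Allocate seats by the method of equal proportions.

    populations -- dict mapping state name to apportionment population
    total_seats -- size of the chamber (435 for the U.S. House)
    """
    if total_seats < len(populations):
        raise ValueError("need at least one seat per state")

    # Every state starts with one seat.
    seats = {state: 1 for state in populations}

    # heapq is a min-heap, so priorities are stored negated.
    heap = [(-pop / sqrt(1 * 2), state) for state, pop in populations.items()]
    heapq.heapify(heap)

    for _ in range(total_seats - len(populations)):
        _, state = heapq.heappop(heap)      # state with the highest priority
        seats[state] += 1
        n = seats[state]
        next_priority = populations[state] / sqrt(n * (n + 1))
        heapq.heappush(heap, (-next_priority, state))

    return seats


# Hypothetical three-state country with a ten-seat chamber.
example = {"A": 6_000_000, "B": 3_000_000, "C": 1_000_000}
print(apportion(example, 10))
```

Because only state totals enter the calculation, apportionment needs nothing more detailed than the state-level counts described above.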
Redistricting data at the block level will be produced in the Census Edited File, which will be released state by state between February 18, 2021, and March 31, 2021. The Census Edited File differs from the Census Unedited File because it has used administrative records and some statistical modeling for imputing missing values. The Census Edited File goes through the Disclosure Avoidance System to minimize the possibility of identifying any individual in the data.
The President’s Executive Order 13880 commits the Census Bureau to release the Citizen Voting-Age Population (CVAP) data by March 31, 2021 (U.S. Census Bureau 2019). This is basically the same time frame as the release of the Census Edited File (CEF). CVAP will combine administrative data from a number of federal agencies into a separate microdata file that will contain a “best citizenship” variable for every person in the 2020 Census. The Disclosure Avoidance System will be used with this microdata file. The same confidentiality rules that apply to the CEF will also apply to the citizenship variable. The data will be produced at the block level and will be available to the public.
“Best Citizenship” Variable
There is an internal working group in the Census Bureau that will release the specifications for the CVAP by March 31, 2020. The Census Director has convened the Interagency Working Group, consisting of high-level executives in federal agencies whose databases have person-level data relevant to estimating citizenship. To be useful the administrative records will need to have variables that contain citizenship data and that can link to names and addresses on the census record.
Among the possible sources of citizenship data that the Census Bureau will examine are the following (U.S. Census Bureau 2019):
- Social Security Administration NUMIDENT, which contains place of birth and citizenship status for approximately 94% of the population
- Internal Revenue Service 1040 and 1099 forms, which would be used for purposes of most current address
- CMS Medicare and Medicaid/CHIP, which contain some citizenship data but are needed for current address
- Housing and Urban Development, which potentially contains current address information from Federal Housing Administration, Public and Indian Housing Information Center, Tenant and Rental Assistance Certification System, Low-Income Housing Tax Credits, and Computerized Homes Underwriting Management System
- Department of Homeland Security USCIS/CBP/ICE, which contains information on lawful permanent residents and naturalization data (USCIS), visas (ICE), arrival/departure (CBP)
- Department of State (Passport Services), for citizenship data from passports
- Social Security Administration, for information from the master beneficiary record
- Indian Health Service, for patient registration data
- Department of Justice, for US Marshals and Citizenship and Immigration Data Collection
An important part of the research program at the Census Bureau will be the statistical modeling needed to combine citizenship data from various sources to produce the “best citizenship” variable. This variable is the “best” in terms of providing the best estimate of whether someone is or is not a citizen. Some examples of variables that could go into this model are place of birth, naturalization data, and passport information.
Linking Databases
A second important research effort will be improving the linkage of names and addresses from these different databases. There are many issues here: people with the same name, people who have had many addresses, people who moved just before the census (and before they notified Social Security and other agencies of the address change). And the process will need to be repeated for millions of people over the age of 18.
The Census Bureau is still studying the linkage issues, and so what appears below is conceptual guesswork about how the process could work. Suppose the issue is whether the next person in the census count should receive a best-citizenship designation as “citizen” or “non-citizen.” Let us hypothesize that this person is named John Smith and the census return shows that his household resides at 1234 Main Street in Wallbridge, Texas. There are many John Smiths in the various data files, so the computer searches for Main Street and Wallbridge, Texas. If there is a match—let’s say with Social Security records—and it shows the same name and address, and a birthplace in Tyler, Texas, then presumptively John Smith can be designated a citizen. People born in the United States are U.S. citizens.
Suppose, on the other hand, that John Smith was born in Germany, where his father was stationed on military service. Children born abroad to U.S. citizens are also citizens, but discovering whether John Smith qualifies may be more complicated. If John Smith holds a passport, that will settle the issue by linking to the passport data. If John Smith does not hold a passport, then additional searching through the other government databases will be necessary. He will not, for example, be a visa-holder because he is a citizen.
Suppose alternatively that the Social Security record shows many John Smiths, none of whom is currently living in Wallbridge. Perhaps an IRS record will prove a match for the Main Street address in Wallbridge and provide a Social Security number. It is very likely that John Smith uses his Social Security number as his Taxpayer Identification Number. That number could link into the Social Security information and show that John’s birthplace is in Texas. A matching program will need to be developed that could potentially go through 3, 5, 7 or more databases before finding a name and address match, and then go back through the databases looking for citizenship data.
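The two-pass search just described can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the Census Bureau's procedure: the record layout, the field names (person_id, citizen), the identifiers, and the exact-match rule are all hypothetical. The first pass looks through the databases in order for a name-and-address match and picks up an identifier; the second pass goes back through the databases with that identifier, looking for citizenship evidence.

```python
def normalize(text):
    """Lowercase, drop commas, and collapse spaces before comparing."""
    return " ".join(text.lower().replace(",", " ").split())


def find_person_id(census_record, databases):
    """Pass 1: return the identifier of the first name-and-address match."""
    target = (normalize(census_record["name"]),
              normalize(census_record["address"]))
    for _source, records in databases:
        for rec in records:
            candidate = (normalize(rec["name"]),
                         normalize(rec.get("address", "")))
            if candidate == target:
                return rec["person_id"]
    return None


def find_citizenship(person_id, databases):
    """Pass 2: return the first citizenship value recorded for this identifier."""
    for _source, records in databases:
        for rec in records:
            if rec["person_id"] == person_id and rec.get("citizen") is not None:
                return rec["citizen"]
    return None


def best_citizenship(census_record, databases):
    person_id = find_person_id(census_record, databases)
    if person_id is None:
        return "unresolved"          # no link found in any database
    citizen = find_citizenship(person_id, databases)
    if citizen is None:
        return "unresolved"          # linked, but no citizenship evidence
    return "citizen" if citizen else "non-citizen"


# Hypothetical data: the Social Security file holds an outdated address,
# while the tax file holds the current one under the same identifier.
databases = [
    ("social_security", [
        {"person_id": "ID-0001", "name": "John Smith",
         "address": "77 Oak Avenue, Tyler, Texas", "citizen": True},
        {"person_id": "ID-0002", "name": "John Smith",
         "address": "9 Elm Road, Dallas, Texas", "citizen": True},
    ]),
    ("irs", [
        {"person_id": "ID-0001", "name": "John Smith",
         "address": "1234 Main Street, Wallbridge, Texas"},
    ]),
]
census_record = {"name": "John Smith",
                 "address": "1234 Main Street, Wallbridge, Texas"}
print(best_citizenship(census_record, databases))
```

The sketch deliberately returns "unresolved" rather than "non-citizen" when no link is found, because a failed match is not evidence about citizenship. A production matcher would also need approximate comparison of names and addresses, which this exact-match version omits.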
The most difficult cases will be the truly undocumented, for whom there might be no more information than the census record. For a person who appears in no other government database, the issue will then be whether this person is presumptively a non-citizen, or simply someone for whom the matching algorithm failed.
Given its history of seeking to identify errors and improve data, the Census Bureau will almost surely launch an effort to evaluate the success of the matching program, particularly in identifying false positives (non-citizens labeled citizens) and false negatives (citizens labeled as non-citizens). The number of unsuccessful matches is likely to vary from state to state. For a state with a large net migration, such as Florida or Texas, the address matching could be particularly problematic. A large mismatch rate, in turn, would raise concerns about the fairness of redistricting using the CVAP. It is difficult in advance to determine the data quality issues that could arise in a database that has not yet been produced.
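An evaluation of this kind could be summarized with two rates, using the definitions given above. The sketch below assumes that an audit sample exists in which each person's true status is known independently of the matching program. That sample, the function name, and the figures are all hypothetical.

```python
def match_error_rates(pairs):
    """Compute error rates from (true_status, assigned_status) pairs.

    Each status is either "citizen" or "non-citizen".
    False positive: a non-citizen labeled as a citizen.
    False negative: a citizen labeled as a non-citizen.
    """
    false_pos = false_neg = non_citizens = citizens = 0
    for truth, assigned in pairs:
        if truth == "non-citizen":
            non_citizens += 1
            if assigned == "citizen":
                false_pos += 1
        else:
            citizens += 1
            if assigned == "non-citizen":
                false_neg += 1
    return {
        "false_positive_rate": false_pos / non_citizens if non_citizens else 0.0,
        "false_negative_rate": false_neg / citizens if citizens else 0.0,
    }


# Hypothetical audit sample: 8 citizens (1 mislabeled), 2 non-citizens (1 mislabeled).
sample = (
    [("citizen", "citizen")] * 7
    + [("citizen", "non-citizen")]
    + [("non-citizen", "non-citizen")]
    + [("non-citizen", "citizen")]
)
print(match_error_rates(sample))
```

Computing the two rates separately for each state would show whether mismatches are concentrated in high-migration states, which is the fairness concern raised above.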
The legal issues around whether the legislatures may then redistrict using CVAP are not entirely clear, although the Rucho decision in 2019 might mean that the courts will take a hands-off position. Almost surely there will be legal challenges before state legislatures complete redistricting with the new “best citizenship” variable.
Do We Still Need a Census?
If the Census Bureau can correct census data with administrative records, and if the Census Bureau can successfully create a best-citizenship variable constructed entirely from administrative records, then it appears feasible for the Census Bureau to construct a database that resembles a continuous population register.
Such a continuous register would require the cooperation of vital statistics bureaus, located in the states, to document when someone enters the continuous register through birth and when someone leaves the register through death. The Department of Homeland Security could provide information on legal entrants to the United States and on legal entrants who leave the United States.
Problems would remain that have to do principally with the mobility of the population. U.S. citizens who emigrate to other countries, while typically relatively few in number, are harder to count unless, perhaps, they are receiving a Social Security check abroad. Internal migration will raise the issue of permanent addresses for the register. In particular, young people move frequently as they go to college, join the military, and find and change jobs. Then they often change residences as they progress through the family life cycle. Similarly, retirees are often a mobile population, sometimes having both summer and winter homes or downsizing and moving to be near adult children.
And the problem of illegal entrants would remain difficult. A foreign national who enters the United States undetected would probably not be included in the register. A foreign national who enters on a tourist visa and then overstays the visa could potentially be identified, although perhaps without a useable address.
The geographic location issues are critical if something like a continuous population register were to replace the census. Even if the Constitution were to be amended to eliminate the census in favor of a continuous population register, there would still be a need to reapportion the House of Representatives and to redistrict the states.
The CVAP will be an important development toward answering some of these questions, even if it is never used to redistrict a single state. A successful CVAP will demonstrate the feasibility of linkages among government databases to account for every resident in the country. With the addition of successful federal-state linkages (an addition that presents both legal and technical challenges), the CVAP would represent a prototype continuous population register.
And assuming that Title 13 would remain in effect and would apply to CVAP or whatever population register would replace the census, then the confidentiality of the information would be legally assured.
Summary
Censuses are not the only way to provide statistics about the population. Population registers have been used successfully in many parts of the world. The United States has administrative records that were designed for specific purposes, but that nevertheless have potential to serve as partial population registers. A number of federal agencies have such databases, and the states have their own partial registers.
President Trump’s Executive Order to produce a citizen voting-age population (CVAP) file to be provided to the states poses two challenges for the Census Bureau. The first is to develop a “best-citizenship” variable relying on administrative records rather than asking a citizenship question in the census. The second is to develop successful linkages and record matches with many federal databases.
Assuming that the CVAP process is successful and that the mismatch error rate is considered tolerable, these developments raise the issue of whether the United States can and should develop a federal continuous population register. Such a register would require additional record linkages to the states’ vital statistics bureaus. It would be subject to its own errors and shortcomings. Most importantly, for the continuous population register to supplant the census, an amendment to the United States Constitution would be required.