Missouri Census Data Center

The American Community Survey vs. the Decennial Census Long Form

Are We Better Off Now Than We Were a Decade Ago?

John Blodgett, OSEDA, June, 2009

What's New

It's getting close again. We are less than a year away from the next Decennial Census day - April 1, 2010. That means we are less than two years away from seeing the first tabulated results of that once-in-a-decade survey, and less than two years away from knowing with near certainty just how many people live in our counties, our cities and our neighborhoods, regardless of size. Not only will we know the head count but we'll also know important demographic details of the population: age, race, gender, Hispanic origin, living in a household (and whether it's a family household) or in group quarters (what kind), renters or owners, etc. Or at least we'll know these things as they were circa that point in time - April 1, 2010. So we can easily spot trends by linking to comparable data gathered on April 1, 2000 and even April 1, 1990.

What else are we looking forward to getting from the new census? Won't we also be getting new data regarding things such as people's incomes, poverty status, education, occupation, employment status, house values, and disabilities? Well, no. Not from the decennial census, not this decade.

Because now that we have the ACS (the American Community Survey - see www.census.gov/acs/www/index.html for details) we won't be needing that portion of the decennial census where these kinds of data have been collected and reported for decades. The long and short of it is there will be no long form questionnaire used for the 2010 census, only the short form with its seven basic questions per person. Fewer questions being asked about fewer subjects results, of course, in fewer tables dealing with fewer subjects being published based on the survey.

But we do have data regarding these subjects based on the American Community Survey. The ACS was created so that we would have such data not just once every ten years, but once every year of the decade. We have been getting such data based on the ACS since 2006, based on surveys conducted nationwide starting in 2005. So does this mean we are not going to miss the long form (aka "sample") data from the decennial census? The answer is not so simple. It depends on what you are looking for. Perhaps the most critical factor is the size of the geographic areas for which you want to see data. If you are a person who focuses on the "big picture" - what is happening nationwide or statewide - or even city or countywide for larger cities and counties then you are probably going to be pretty happy with the ACS as a replacement for the Long Form. But if you need data for smaller areas, for your town of 10,000 people or your local wards, planning areas or transportation zones then you may not be as happy with what's new.

What's The Difference?

Back in 2002 when the first long form-based/sample results were published based on the 2000 Census, users were able to get detailed socio-economic data such as median incomes and poverty rates for all kinds of geographic areas. You could get data for the nation, your state, your county, your town or township (county subdivision), your ZIP code, your neighborhood (census tract or block group), your city (regardless of size), etc. These data were delivered in a data product called Summary File 3. The data were published for every county, every city, every town or township, every ZIP code, every census tract and block group in the country. There was no suppression in the data. There were some real issues with sampling error for very small geographic areas, but for the most part no attempts were made by the Census Bureau or those intermediaries providing access to these data to call attention to potential problems due to such issues. (There are links to data notes when accessing SF3 tables in the American FactFinder web application, but nothing that would allow you as a practical matter to readily evaluate the data with respect to margins of sampling error.)

We do not exactly have something comparable to Summary File 3 based on the ACS yet. One could argue that we never will. We do have a set of base (or "detailed") tables and a set of standard extract/profiles available via AFF (the Census Bureau's American FactFinder web site) that are similar to the kinds of data that can be extracted from SF3. But the similarities are (for now, at least) probably outweighed by what's different. The differences include:

  • You have to choose between data based on a single year (2007) of data or based on 3 years of data (2005-2007). The latter means you will be accessing multiple year period estimates, meaning data based on some kind of 3-year "average" of the characteristics being reported. The single-year data are available for the 7,199 geographic areas with populations of at least 65,000 while the 3-year period estimates are available for about twice as many areas -- those that meet the 20,000 population threshold.

  • Data are not yet available for around 1,400 of the nation's more than 3,100 counties and for over 90% of the nation's places (incorporated and census-designated). There are no data at the ZIP code/ZCTA, census tract or block group levels. Data for these smaller areas will not be available until there are 5 years of survey data available, which means 2005-2009, and these will not be published until very late in 2010 or perhaps early in 2011.

  • As far as what kind of information is contained within these tables, you could say that the SF3 tables and the ACS base/detailed tables are substantially comparable. There are more tables in the ACS collection than there were in SF3. But in the decennial census we also had a product called Summary File 4 that had many additional (generally more detailed) tables. The ACS detailed/base tables are pretty much comparable to the combination of SF3 and SF4 tabulations. There are, however, two significant differences between SF3 tables and ACS base tables:

    1. The ACS tables always come with MOEs - Margin of Error measures. This is not a problem in itself, but it is a symptom: it reflects the potential sampling problems associated with the smaller sample size upon which the ACS estimates are based.

    2. There are many tables (most notably for smaller areas of, say, less than 100,000 population) that have been suppressed because their sample size was deemed insufficient. The Bureau has promised that when the 5-year period estimates are released there will be no such suppression, with a few exceptions (so far).
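The population thresholds described in the list above can be summarized as a small lookup. What follows is a minimal Python sketch, not Census Bureau code: the function name acs_products_available and the five_year_only flag are hypothetical, and the thresholds (65,000 and 20,000) are simply the ones quoted above. It also assumes the 5-year period estimates have already been published, which is not yet the case as of this writing.

```python
def acs_products_available(population: int, five_year_only: bool = False) -> list[str]:
    """Return the ACS period estimates an area of this size would qualify for.

    population     -- total population of the geographic area
    five_year_only -- hypothetical flag for geography types (ZIP code/ZCTA,
                      census tract, block group) that only receive 5-year
                      period estimates regardless of their population
    """
    products = []
    if not five_year_only:
        if population >= 65_000:
            products.append("1-year")   # single-year estimates
        if population >= 20_000:
            products.append("3-year")   # 3-year period estimates
    # Assumption: 5-year period estimates cover every area, whatever its size.
    products.append("5-year")
    return products


# Illustrative inputs only; the populations are made up except for the
# roughly 6,800 residents of a small town used as an example in this article.
for label, pop, zcta in [
    ("large county", 150_000, False),
    ("mid-sized city", 30_000, False),
    ("small town", 6_800, False),
    ("populous ZIP code", 30_000, True),
]:
    print(f"{label:18s} {pop:>8,d}  ->  {acs_products_available(pop, zcta)}")
```

The point of the sketch is that the choice of product is driven almost entirely by population size, with the ZIP code, tract and block group levels as the exception.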

How The ACS Is Better Than The Decennial Long Form

  1. Timeliness for areas that are large enough to get new data every year. That means any geographic area with 65,000 or more population. With the 2000 decennial census we had excellent data for entities such as the city of Columbia, Boone County, and the state of Missouri, as delivered in 2002 based on the 2000 census. But the data that were excellent in 2002 became less so as the decade went on and they were being used as a proxy for current data. The ACS provides us with a new set of refreshed data every year. The sampling error is larger than it was with the decennial census but in most cases for larger areas it is quite acceptable. If you are concerned regarding the sampling error you always have the option of using the 3-year period estimates from the ACS as an alternative, sacrificing some data currency for the much larger sample size.

  2. New questions can be added to the survey without having to wait for the decade to change. For example, we were able to add a question regarding health insurance coverage for the 2008 survey. We’ll be able to see that data later this year (2009), for areas of 65,000 or more population. (We won't be able to see it for another 4 years for ZIP codes and small cities because we'll need the question to be on the survey for 5 years.)

  3. Generally speaking, for those doing “big picture” policy analysis it is a significant improvement to have such a rich source of socio-economic data refreshed every year. We have data for the nation, states, congressional districts, larger cities and counties, metro areas and even PUMAs every year. Not to mention the PUMS (Public Use MicroSample) files (released each year as part of the ACS product package), which are an invaluable resource for decision-makers as well as academic researchers.
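For readers who want to act on the sampling-error concern raised in item 1, here is a rough Python sketch of how a published margin of error can be turned into a standard error and used to compare two estimates. It assumes that ACS margins of error are stated at the 90 percent confidence level (z = 1.645) and that the two estimates are independent, for example from different areas or from non-overlapping periods. The function names and the sample figures are hypothetical, chosen only for illustration.

```python
import math

Z_90 = 1.645  # assumption: ACS margins of error use the 90 percent confidence level


def moe_to_se(moe: float) -> float:
    """Convert a 90 percent margin of error into a standard error."""
    return moe / Z_90


def is_significant_difference(est_a: float, moe_a: float,
                              est_b: float, moe_b: float) -> bool:
    """Test whether two independent estimates differ at the 90 percent level.

    Not appropriate for overlapping period estimates, whose samples are
    largely shared and therefore not independent.
    """
    se_diff = math.sqrt(moe_to_se(moe_a) ** 2 + moe_to_se(moe_b) ** 2)
    z = (est_a - est_b) / se_diff
    return abs(z) > Z_90


# Hypothetical poverty rates (percent) with hypothetical margins of error.
print(is_significant_difference(12.1, 3.0, 15.0, 3.5))
print(is_significant_difference(10.0, 1.0, 20.0, 1.0))
```

A comparison that looks like a real change in the raw numbers can easily fail this kind of test once the margins of error are taken into account, which is the practical reason to consider the 3-year estimates when a single year looks too noisy.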

How The Decennial Census Long Form Is (Was) Better Than The ACS

  1. The sampling error associated with the decennial census long form data is much lower (in general) than that of the ACS. The actual number of households from which the Bureau gets completed ACS questionnaires (not to be confused with the number of households that merely receive them) is running about 1 in 11 over any 5-year period (or about 1 in 55 for any single year). The sampling rate is declining slightly over time since the ACS is funded for a fixed number of surveys each year rather than a fixed percentage of households; so as the nation's household count increases each year, the ACS sampling rate declines.

  2. Because of the significant sampling error that can occur in the ACS (for certain geographic areas / table universes) the Census Bureau has adopted a policy of suppressing tables that are deemed statistically unreliable. This problem is made worse by the Census Bureau’s use of an algorithm for measuring unacceptable reliability that many users and statisticians do not think is always valid. (See item 6 in Ten Things to Know...).

  3. Data for smaller geographic areas (especially those under 20,000 population, as well as all ZIP codes, even those with populations over 20,000) are only released as 5-year “period estimates”. Because of under-funding of the ACS in the first half of this decade we did not start getting complete ACS data until 2005, meaning we shall not have the required five years of data needed for publishing small-area data until the 2009 surveys are processed. This means we are not seeing these data now and will not until some time very late in 2010 or perhaps early in 2011. Over half the counties in Missouri have no ACS data yet.

  4. The ACS data are weighted in such a way that they conform to the official Census Bureau population estimates at the county (or contiguous-county group in the case of very small counties) level by age, race, sex and Hispanic origin. "Official" does not always mean reliable, and if those estimates are not accurate (as past history has shown they sometimes very well can be) then the ACS data will not be used to improve them, but instead will just reflect them. In the decennial census this was not a problem since there was a complete-count census taken at the same time as the long-form sample and it was a relatively simple matter to do weighting of the sample data using the short-form data.

  5. For anyone needing reliable socio-economic data for small areas (especially those with populations under 20,000) the news is not good. The data will arrive later than expected (short term), with a good possibility of many tables being unusable due to sampling error and/or suppression based on sampling issues. Because the data are based on a sample taken over a 5-year period it will be impossible to use the ACS data to pinpoint areas that may be undergoing significant changes over the period. (Of course you could not get such trends with decennial long form data either since you only got one set of data every ten years, but at least with the decennial you knew what the numbers were for that one point in time; with the ACS the numbers are always going to be "fuzzy".)

  6. Comparability of the data with that derived from other sources or surveys (including the decennial census) is a problem in many cases. "Period estimates" take some getting used to (unless maybe you're an economist). Even single-year data are period estimates in the sense that they are based on taking 12 monthly surveys instead of a single "snapshot" survey where everyone is instructed to pretend it is April 1. There are also issues of how to compare all of the different flavors of ACS estimates (especially the 1-year vs. 3-year vs. 5-year data, not to mention how to do a trend based on a series of period estimates that overlap).
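To give a feel for the sampling rates quoted in item 1 of this list, the following Python sketch works out the expected number of completed questionnaires for a small area and a rough standard error for a proportion estimated from them. The 1-in-55 and 1-in-11 rates come from the text above; everything else is an assumption made for illustration: the household count and the 12 percent share are hypothetical, and the formula is the textbook one for simple random sampling, which ignores the weighting and design effects of the real survey.

```python
import math

ANNUAL_RATE = 1 / 55     # completed questionnaires per household, single year (from the text)
FIVE_YEAR_RATE = 1 / 11  # completed questionnaires per household, 5-year period (from the text)


def expected_interviews(households: int, rate: float) -> float:
    """Expected number of completed questionnaires at a given sampling rate."""
    return households * rate


def proportion_se(p: float, n: float) -> float:
    """Simple-random-sampling standard error of a proportion (an approximation)."""
    return math.sqrt(p * (1 - p) / n)


households = 3_000  # hypothetical small town
share = 0.12        # hypothetical share of households with some characteristic

for label, rate in [("1-year", ANNUAL_RATE), ("5-year", FIVE_YEAR_RATE)]:
    n = expected_interviews(households, rate)
    se = proportion_se(share, n)
    print(f"{label}: about {n:.0f} completed questionnaires, "
          f"standard error about {100 * se:.1f} percentage points")
```

Under these simplifying assumptions, pooling five years of data multiplies the sample by five and shrinks the standard error by a factor of about the square root of five, which is why small areas are only published as 5-year period estimates.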

The Bottom Line

So what is the answer to the question posed in the subtitle? Are we better off with the ACS instead of the Long Form? Based on what you've read on this page you might be inclined to say the answer is No. The ACS has a lot of features that do not compare favorably with the old reliable Long Form. But just because we listed 6 things where it was inferior and only 3 where we judged it to be better, that does not necessarily mean the bottom line says it loses. You really have to assign weights to these comparison items to reflect their importance. The one item of those discussed that can outweigh all the others is the first one in the How The ACS Is Better Than The Decennial Long Form section: 1. Timeliness for areas that are large enough to get new data every year. This aspect of the ACS is really why it was invented. The details about how it was going to be done, having to do with sample sizes, weighting schemes, data suppression rules, etc., were all minor details. Most of the negative things we have said about the ACS here really do not apply much when you are looking at data for the nation, for a state, a major city or metro area. Problems with sampling just don't make much difference at these levels. It will be very helpful to get ACS data for 2009 in the fall of 2010, so that we can study in great detail the effects of the economic recession across the country. You may not be able to study it down to the neighborhood, but that is not an appropriate level for the typical "big picture" analysis anyway.

And even at much smaller levels, there are those who would argue that these data really aren't all that much worse than the SF3 data, and that we are just more aware of the problems because of the emphasis being placed on MOEs and other measures of sampling error. I do not agree with that viewpoint, but I have heard it.

By far the most significant negative aspect of the ACS as a replacement for the long form is the lack of good data for smaller geographic areas. The 2000 decennial census allowed us to know (for example) that of the approximately 6,800 residents of Livingston, Montana about 12.1% had incomes that put them at or below the poverty level, and that 4.6% had incomes that were below 50% of that poverty threshold. Some time late in 2010 we should be getting some new figures of this sort for Livingston, but they are going to be much less convincing, assuming they are not suppressed. They will be based on data collected going back to January of 2005. It might well be that the poverty situation of 2005 through 2007 was somewhat different from that of 2008-2009; but such differences will all be smoothed over with period estimates. We'll eventually be able to adjust the focus to try and get a clearer picture of poverty in Livingston (or any of the other tens of thousands of American towns with populations below 20,000) as we get those annual "updates" that are only 1/5 new and 4/5 the same as last year.

We are going to have to make adjustments as we learn to accept and live with data that are spread out over five years. We understand the concept, and we understand that there are an awful lot of small rural counties, small towns out on the prairie, and even established neighborhoods in older cities and suburbs that change so slowly that taking a snapshot with a 5-year time exposure will not significantly blur the picture at all. Some places never change, or evolve so slowly that it takes years to notice anything. If you are interested in tracking change for small areas it is going to require having ten consecutive years of ACS data so that you can look at data for two adjacent and non-overlapping 5-year periods. That should be something we can do starting around 2015.
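The arithmetic behind overlapping 5-year periods can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and a period is identified here by the last survey year it covers (so 2009 stands for 2005-2009).

```python
def period_years(end_year: int, length: int = 5) -> set[int]:
    """Survey years covered by a period estimate ending in end_year."""
    return set(range(end_year - length + 1, end_year + 1))


def shared_fraction(end_a: int, end_b: int, length: int = 5) -> float:
    """Fraction of survey years that two equal-length periods have in common."""
    overlap = period_years(end_a, length) & period_years(end_b, length)
    return len(overlap) / length


def first_non_overlapping_end(end_year: int, length: int = 5) -> int:
    """End year of the earliest later period sharing no survey years."""
    return end_year + length


print(sorted(period_years(2009)))       # years in the first 5-year period
print(shared_fraction(2009, 2010))      # consecutive releases share 4 of 5 years
print(first_non_overlapping_end(2009))  # end year of the first comparable period
```

Consecutive 5-year releases share four of their five survey years, so the first period that can be cleanly compared with 2005-2009 is 2010-2014, which lines up with the "around 2015" date given above once the publication lag is added.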

This file last modified Tuesday October 11, 2011, 10:47:46

The Missouri Census Data Center is a sponsored program of the Missouri State Library within the office of the Missouri Secretary of State. The MCDC has been a partner in the U.S. Census Bureau's State Data Center program since 1979.
