The American Community Survey is the ambitious national survey from the U.S. Census Bureau that replaces the long-form portion of the decennial census for the new millennium. While some version of this survey has been in the field since 1999, it was not fully implemented in terms of coverage until 2006. In 2005 it was expanded to cover all counties in the country, and the one-in-40 households sampling rate was first applied. However, persons living in group quarters (such as nursing homes, dormitories, and prisons) were not added to the survey until 2006. (The original plan was to begin GQ coverage in 2005 but last-minute budget reductions delayed it for a year.)
The full implementation of the (household) sampling strategy for ACS, begun in January 2005, entails mailing the survey to about 250,000 households nationwide every month of every year. In January 2006, sampling of group quarters was added to complete the sample as planned (albeit several years later than originally planned). In any given year about 2.5% (one in 40) of U.S. households will receive the survey.
Over any five-year period, about one in eight households should receive the survey (as compared to about one in six that received the census long form in the 2000 census). Unfortunately, receiving the survey is not the same as responding to it, since the Bureau has adopted a strategy of sampling for non-response. This has resulted in something closer to one in 11 households actually participating in the survey over any five-year period.
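The sampling figures quoted above can be checked with a little arithmetic. The following is a rough sketch, not a Bureau calculation: it assumes that no household is sampled more than once within a five-year window, and the variable names are purely illustrative.

```python
from fractions import Fraction

# Figures quoted in the text above.
ANNUAL_RATE = Fraction(1, 40)      # about 2.5% of households per year
MONTHLY_MAILOUT = 250_000          # households mailed the survey each month
EFFECTIVE_RATE = Fraction(1, 11)   # households actually participating over 5 years
YEARS = 5

# Assumption: a household is sampled at most once in any five-year window,
# so the yearly rates simply add up.
five_year_rate = ANNUAL_RATE * YEARS

# Yearly mailout, and the household count that a one-in-40 rate implies.
annual_mailout = MONTHLY_MAILOUT * 12
implied_households = annual_mailout / ANNUAL_RATE

# Share of the five-year sample that ends up participating.
participation_share = EFFECTIVE_RATE / five_year_rate

print("five-year sampling rate:", five_year_rate)
print("annual mailout:", annual_mailout)
print("implied number of households:", implied_households)
print("participating share of sample:", participation_share)
```

Under these assumptions the five-year rate works out to one in eight, and the drop to roughly one in 11 corresponds to about eight of every 11 sampled households actually taking part.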
Data based on the ACS surveys for any calendar year will be published in the late summer of the following year for geographic areas with a minimum of 65,000 population. For smaller areas, the Bureau will only publish data based on surveys for multiple consecutive years as follows:
In addition to the population threshold rules that limit the publication of data for geographic areas, the Bureau also applies its data release rules to each table for each geographic area that passes the total population threshold filter. Basically, the Bureau analyzes the cells of a table and assigns a measure of the statistical reliability of each cell based on the margin of error. The following excerpt from the Bureau's documentation outlines the method:
Data Release Rules
Another kind of data release rule, data quality filtering, applies to ACS 1-year and 3-year estimates. Every detailed table consists of a series of estimates. Each estimate is subject to sampling variability that can be summarized by its standard error. If more than half of the estimates in the table are not statistically different from 0 (at a 90 percent confidence level), then the table fails to meet the rule's requirements and is restricted from publication. Dividing the standard error by the estimate yields the coefficient of variation (CV) for each estimate. (If the estimate is 0, a CV of 100 percent is assigned.) To implement this requirement for each table at a given geographic area, CVs are calculated for each table's estimates, and the median CV value is determined. If the median CV value for the table is less than or equal to 61 percent, the table passes for that geographic area and is published; if it is greater than 61 percent, the table fails and is not published.
Whenever a table fails, a simpler table that collapses some of the detailed lines together can be substituted for the original. If the simpler table passes, it is released. If it fails, none of the estimates for that table and geographic area are released. These release rules are applied to single- and 3-year estimates, but are not applied to the 5-year estimates.
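The median-CV test described in the excerpt can be sketched in a few lines of Python. This is an illustration of the rule as quoted, not the Bureau's own code: the function names and the sample cell values are hypothetical, and the conversion from a published margin of error assumes the 90 percent confidence level (a factor of 1.645).

```python
from statistics import median

CV_THRESHOLD = 0.61   # 61 percent, per the excerpt
Z_90 = 1.645          # assumed factor for a 90 percent margin of error


def moe_to_se(margin_of_error):
    """Convert a 90 percent margin of error to a standard error."""
    return margin_of_error / Z_90


def coefficient_of_variation(estimate, standard_error):
    """CV = standard error / estimate; a zero estimate is assigned 100 percent."""
    if estimate == 0:
        return 1.0
    return standard_error / abs(estimate)


def table_passes(cells):
    """Return True if the median CV across (estimate, standard_error) pairs
    is less than or equal to the 61 percent threshold."""
    if not cells:
        raise ValueError("a table needs at least one estimate")
    cvs = [coefficient_of_variation(est, se) for est, se in cells]
    return median(cvs) <= CV_THRESHOLD


# Hypothetical (estimate, standard error) pairs for one table in one area.
reliable_table = [(1200, 150), (300, 120), (0, 25), (45, 40), (80, 30)]
noisy_table = [(10, 9), (0, 5), (20, 19), (500, 50)]

print(table_passes(reliable_table))
print(table_passes(noisy_table))
```

Because the rule uses the median, a table can pass even when some individual cells are very unreliable, which is worth remembering when reading small cells in a published table.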
To access the data within the MCDC data archive via Uexplore/Dexter, go to the ACS section of the Uexplore/Dexter home page and follow the link to the desired vintage (e.g., follow the acs2014 link to access data based on 2014 vintage data).
The acs directory is generic (i.e., not time or data product specific) and contains a number of interesting documents.
For most queries, casual users would probably be better off using American FactFinder (AFF). If you are looking for basic data for just a few geographic areas, then our ACS Profiles and/or ACS Extract assistant apps (see below) are hard to beat for ease of use combined with good flexibility. Accessing the more complex summary (base) tables is much more challenging on our site. You must use the Uexplore/Dexter applications to access two separate subdirectories of the acs[yyyy] data directories (e.g., we have acs2014/basetbls and acs2014/btabs5yr) where we keep these base tables. Within these subdirectories, you'll find data sets that come in groups of six, with names ending in 00_07, 08, 09_16, 17_20, 21_24, and 25_28. These are topic intervals. For example, a data set ending in 17_20 will contain all tables relevant to topics 17 (poverty), 18 (disability), 19 (income), and 20 (earnings). Not everybody who needs to use ACS data knows or wants to know about topic codes. They should use American FactFinder.
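To make the topic intervals concrete, here is a minimal sketch that maps a topic code to the data set name ending listed above. The function and constant names are hypothetical; only the six endings come from the text.

```python
# The six topic intervals named in the text, as (low, high, name ending).
TOPIC_INTERVALS = (
    (0, 7, "00_07"),
    (8, 8, "08"),
    (9, 16, "09_16"),
    (17, 20, "17_20"),
    (21, 24, "21_24"),
    (25, 28, "25_28"),
)


def interval_suffix(topic_code):
    """Return the data set name ending that covers the given topic code."""
    for low, high, suffix in TOPIC_INTERVALS:
        if low <= topic_code <= high:
            return suffix
    raise ValueError(f"topic code {topic_code} is outside 0-28")


# Poverty (17) and income (19) both live in the 17_20 data sets.
print(interval_suffix(17))
print(interval_suffix(19))
```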
One of the keys to using these large datasets is knowing what tables are available, and within these tables, what each of the data cells represents. For this we have created variable metadata files in each base tables subdirectory.
The first two digits of a base table number are the topic code. So, if you are looking for tables related to poverty (for example), you need look only at tables B17xxx and C17xxx. These tables would be found in a data set such as ustabs17_203yr (three-year period estimates with all tables in topics 17 through 20). The topic group is part of the data set name.
Base (summary) table names comprise a letter (B or C), a five-digit code (the first two digits of which constitute the topic code), and sometimes an alpha suffix (to indicate a special race/Hispanic universe, per the list above). There may also be a "PR" suffix to indicate a file available only for Puerto Rico. We do not include PR tables in these metadata files.
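The naming convention just described can be expressed as a small parser. This is only a sketch under the rules stated here: the `TableId` type, the `parse_table_id` function, and the sample IDs are illustrative, and the pattern accepts any single letter as the universe suffix because the text does not restrict it further.

```python
import re
from typing import NamedTuple


class TableId(NamedTuple):
    kind: str             # "B" or "C"
    topic: str            # first two digits of the five-digit code
    number: str           # the full five-digit code
    universe_suffix: str  # optional alpha suffix, "" if absent
    puerto_rico: bool     # True when the name ends in "PR"


# Letter, two-digit topic, three more digits, optional alpha suffix, optional "PR".
_TABLE_ID = re.compile(r"^([BC])(\d{2})(\d{3})([A-Z]?)(PR)?$")


def parse_table_id(table_id):
    """Split a base table name into its parts, or raise ValueError."""
    match = _TABLE_ID.match(table_id.strip().upper())
    if match is None:
        raise ValueError(f"not a base table name: {table_id!r}")
    kind, topic, rest, suffix, pr = match.groups()
    return TableId(kind, topic, topic + rest, suffix, pr is not None)


# Illustrative table names only.
for name in ("B17001", "B17001A", "B05001PR"):
    print(parse_table_id(name))
```

Pulling out the topic code this way makes it easy to decide which topic-interval data set to open before running a query.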
Once you know that you need to access the base (summary) tables, and you have determined which table(s) you need, you are ready to do a Dexter query. This is the relatively easy part. You will need to locate the appropriate base tables subdirectory. Then just select one of the ustabs[topic-interval] data sets to invoke Dexter to access that collection of summary/base tables. When the Dexter query form page is displayed, you will see in Section III a select list of table titles.
Both the Census Bureau and the Missouri Census Data Center provide data profile reports (and corresponding data files) containing highlights of the very detailed information contained in the complete set of base tables. The Bureau profiles can be accessed via American FactFinder. The MCDC's ACS applications include ACS Profiles and ACS Trends.
The ACS also includes a public use microdata sample (PUMS) product. We keep all such datasets (regardless of year) in the acspums data directory. This collection will only be of direct interest to researchers with access to and knowledge of how to use a statistical software package. For a more detailed discussion, see the ACS PUMS section of our Ten More Things to Know... web page.
Users of this web site will notice that we place considerable emphasis on data summarized at the PUMA geographic level. This is because PUMAs are large enough (100,000 minimum population) that they qualify to have new single-year ACS data published each year. We can use PUMA data, therefore, to look at trends and maps that cover the entire state. We have created custom reports to help users understand where these PUMAs are (what counties and cities they contain or in which they are contained), together with links to PDF map files showing them (see our geographic reference reports). To learn more, see the discussion of PUMAs on our Ten Things to Know About the American Community Survey page or our All About PUMAs page.
There has been and continues to be a lot written about the ACS. Here are some of our favorite resources.