Missouri Census Data Center

Notes On the NCHS Bridged Race Population Estimates

Rev. 08/05/13

General Information

These estimates were commissioned by the National Center for Health Statistics (NCHS) and generated for them by the U.S. Bureau of the Census. They are alternative versions of the Bureau's "casrh" series of estimates (county, age, sex, race and Hispanic origin). They are annual population estimates, mostly post-censal with some inter-censal, at the state and county levels for the four demographic categories mentioned (age, sex, race and Hispanic origin). The numbers have been generated for the years 1990 through (at the time of this writing) 2011, with new values released at an approximate 1-year lag. (For example, the estimates for July 1, 2013 should be available circa July 2014.) These estimates differ from the standard casrh estimates in two critical ways:

  1. The race categories are very different. The "bridged race" categories used on these files are:
    • White
    • Black or African American
    • American Indian, Eskimo, Aleut
    • Asian & Pacific Islander.
    There is no separating the Asians from the Native Hawaiians or other Pacific Islanders, and there is no "Other" race category; such persons have all been allocated to one of the 4 categories above (in a process called "bridging"). Note that there are no multi-race categories. Using bridging techniques, all persons who indicated they were of multiple races were reassigned to a single race group. Detailed methodology is available from the NCHS web site.

  2. While the commonly-available numbers in the casrh series use 5-year age cohort categories, these estimates are for single years of age, except for the 85-and-over category.

These numbers are derived from the same basic source as the other official Census Bureau population estimates, and where comparable demographic categories are used, the numbers should match. For example, if you sum all the estimates for Hispanic persons for a given county across the Age categories 00 through 04, you should get the same number that appears in the casrh series for the 0-4 cohort, Hispanic, for that county. Detailed methodology descriptions are available at the NCHS File Documentation web site.
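The cohort check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the function names and the population values are hypothetical, and the Hispanic code of 2 is assumed (these notes only show 1 being used for non-Hispanic).

```python
from collections import defaultdict

# Hypothetical detail records: (county FIPS, single year of age, Hispanic code, population).
# The populations are made-up illustration values, not actual estimates.
# Hispanic code 2 = Hispanic is an assumption; the notes only show 1 = non-Hispanic.
detail_rows = [
    ("29001", 0, 2, 10),
    ("29001", 1, 2, 12),
    ("29001", 2, 2, 11),
    ("29001", 3, 2, 9),
    ("29001", 4, 2, 13),
    ("29001", 5, 2, 14),
]


def cohort_label(age):
    """Map a single year of age (0-85, where 85 means 85 and over) to a 5-year cohort."""
    if age >= 85:
        return "85+"
    low = (age // 5) * 5
    return f"{low}-{low + 4}"


def sum_by_cohort(rows):
    """Total the single-year populations by county, Hispanic code and 5-year cohort."""
    totals = defaultdict(int)
    for county, age, hispanic, pop in rows:
        totals[(county, hispanic, cohort_label(age))] += pop
    return dict(totals)


cohort_totals = sum_by_cohort(detail_rows)
# The 0-4 total is the figure to compare against the casrh value for the same county.
print(cohort_totals[("29001", 2, "0-4")])
```

With real data, the total for each cohort would be compared against the corresponding casrh figure for the same county, year and Hispanic origin category.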

We now have these estimates for the full decades of the 1990s and 2000s (2000-2009), and we add post-2010 data as it becomes available. For the decade 2000-2009 we offer both the original post-censal estimates and updated inter-censal estimates (see the Census Bureau web page regarding these updates if you are not familiar with the concept).

In addition to all these between-census July 1 estimates, NCHS also provides comparable data based on the 2000 and 2010 decennial censuses. We store these decennial data in a single national data set per census year. Each looks just like one of the estimates data sets, except that it has only a single numeric population count rather than a time series of estimates. Summary (_sumry) versions of these data sets have also been created. The sets are named usbridged2kcen and usbridged2k_sumry for the 2k (2000) census and usbridged2010cen and usbridged2010_sumry for the 2010 census.

The Datasets

We currently have seven data sets per state (substitute the state postal code for XX in the set names):

  1. XXnchsbridged19xx: detail data for 1990-1999.

  2. XXnchsbridged19xx_sumry: summary data for 1990-1999.

  3. XXnchsbridged20xx: detail data for 2000-2009. Original, post-censal estimates. (To get detailed inter-censal estimates for this decade use the usnchsbridged200x_intrcnsl data set and filter for the state).

  4. XXnchsbridged20xx_sumry: summary data for 2000-2009. Original, post-censal estimates.

  5. XXnchsbridged20xxi_sumry: summary data for 2000-2009. Inter-censal estimates.

  6. XXnchsbridged201x: detail data for 2010-20yy. Latest estimates, replaced with new version each year.

  7. XXnchsbridged201x_sumry: summary data for 2010-20yy. Summary version of previous set.
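As a small illustration of the naming pattern in the list above, the following Python sketch expands a state postal code into the seven data set names. The function name is hypothetical, and lowercasing the postal code is an assumption based on the California example (canchsbridged201x) given elsewhere in these notes.

```python
# Name suffixes taken from the list of seven per-state data sets above.
SUFFIXES = [
    "19xx",         # detail data, 1990-1999
    "19xx_sumry",   # summary data, 1990-1999
    "20xx",         # detail data, 2000-2009, post-censal
    "20xx_sumry",   # summary data, 2000-2009, post-censal
    "20xxi_sumry",  # summary data, 2000-2009, inter-censal
    "201x",         # detail data, 2010 onward
    "201x_sumry",   # summary data, 2010 onward
]


def state_dataset_names(postal_code):
    """Return the seven per-state data set names for a two-letter postal code."""
    code = postal_code.strip().lower()
    if len(code) != 2 or not code.isalpha():
        raise ValueError("expected a two-letter state postal code")
    return [f"{code}nchsbridged{suffix}" for suffix in SUFFIXES]


for name in state_dataset_names("MO"):
    print(name)
```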

Annual Processing

Each year we download a compressed file from the NCHS web site. It contains one very large text file with the estimates for every county in the U.S. over the entire post-2010-census period (starting with July 1, 2010) for which estimates are available. We run SAS conversion setups to (re)create a pair of data sets per state. (We write over the old data sets, creating new generations of data; estimates published in a previous year may change in a later estimate year because of challenges or other revisions.) The two data sets are as follows:

  1. Detail data set: A direct transcription of the raw input file. Each observation represents a set of July 1 estimates, starting with 2010 and going through the latest available year (currently 2011), for a specific county, single year of age, race, sex and Hispanic origin. View the sample listing of the first 200 observations of the Missouri nchsbridged dataset. Note that it starts right out with data for the first county in the state (29001=Adair; the variable stores County as a 5-digit FIPS code, but we specified a format code of $county. in the Formats text box in Sec. V.c of the Dexter query form) and has 86 rows/observations with code values of 1 for sex, race and Hispanic. So these 86 rows are estimates for white male non-Hispanics by single years of age. In obs 87 the value of Sex becomes 2 (female), and we then get 86 rows of estimates for white female non-Hispanics. The dataset has only a few variables: the 2 geographic variables (state and county), the 4 demographic ID variables (age, sex, race, hispanic) and the PopJLxx time-series estimates. But it has a great many rows. It is summarized data, but the detail is such that it almost resembles microdata. The dataset is rewritten each year: a new estimate (year) is added to the time series, while the time-series variables that were already there get "refreshed".

  2. Summary data set: The second dataset is a direct derivative of the first and contains summaries and restructuring of the raw data. It has the same name as the first dataset but with _sumry appended; so the two datasets for California are canchsbridged201x and canchsbridged201x_sumry. We have placed a sample listing of part of one of our _sumry datasets in the nchsbri directory. The _sumry dataset has fewer rows and more variables than the original dataset. Important distinctions include

Code Values

These files use category codes that you need to know to interpret the data. These include both custom demographic category codes and standard FIPS geographic codes. Here are the variables, the codes used and their meanings.

Access the Data Via Uexplore/Dexter

Access the data in the /pub/data/popests/nchsbri data directory. It will be much easier to find things if you navigate via the Datasets.html file in this directory.

This file last modified Monday August 05, 2013, 11:44:48

The Missouri Census Data Center is a sponsored program of the Missouri State Library within the office of the Missouri Secretary of State. The MCDC has been a partner in the U.S. Census Bureau's State Data Center program since 1979.
