These files provide basic population counts from the 2000 decennial census. Total persons and a breakdown of persons by 6 major race categories and by voting age are provided. Hispanic origin counts are also provided, combined with race and voting age.

These files provide very fine geographic detail, with summaries for areas ranging from the entire U.S. to a census block. There are dozens of geographic summary levels provided.

The original files from the U.S. Census Bureau had a full set of over 270 data cells for each geographic area. In processing the files, the Missouri Census Data Center has attempted to provide more compact and, we hope, easier to use data summaries. Most of the data sets found here are compact summary data sets in which each observation (row) corresponds to one geographic entity (a census tract or a county, for example) and the variables are a set of about 34 key items gleaned from the over 270 cells on the original summary record. In many cases, we have done aggregation to create summaries for geographic entities that were not summarized on the original files (for example, Metropolitan Statistical Areas for the U.S. and School Districts in Missouri).

Geographic Coverage

Important Note: [see just below for follow-up on this note] On 10-27-08 all data from this collection was moved to a 1.6G .zip archive stored in the /pub/data3 directory on the mcdc Unix platform, where it is not directly accessible to the public. We kept only the datasets for Missouri in this directory. If you need to access data for other states, please contact the MCDC. Hence, take the next sentence with a grain of salt.
More important follow-up note: In April of 2011, following the release of the 2010 pl94 data, we decided there would be renewed interest in these data. So we brought back the entire national collection and stored it in the new othrstts subdirectory where you can now easily access data for the entire U.S.

The data covers the entire country. We have a few more geographic units summarized for Missouri, but the basic data down to the census block level is available for all states. Generally, the coverage of each data file varies and is identified by the first two letters of the file name, which is either us or a state postal abbreviation such as mo, ca, il, etc.

Data Set Organization and Naming Conventions

Sets for All States

For each state we have created a series of SAS data sets and views (we do not count index files as SAS data sets). A view is just like a data set for you, the user. Internally, however, it is not stored as an actual data set, but instead as a series of SAS SQL commands that generate a data set when you reference the view. SAS data sets and data views are referred to as "SAS data files". These data files have names that begin with the state postal code. For example, the data files for the state of Alabama (postal code "al") have names beginning with al, such as altracts. Remember: all these data files are available for all states. Just as there is an altracts data file with data for Alabama, there are also njtracts and catracts data files with data for New Jersey and California, etc.
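The naming convention described above can be sketched in a few lines of Python. This is only an illustration: the function names data_file_name and coverage_prefix are hypothetical, and "tracts" is the only unit suffix confirmed by the examples in the text.

```python
def data_file_name(postal: str, unit: str) -> str:
    """Build a data file name such as 'altracts' from a postal code and a unit suffix.

    Hypothetical helper: it only mirrors the prefix-plus-unit pattern
    described in the text and does not check that the file exists.
    """
    postal = postal.strip().lower()
    if len(postal) != 2 or not postal.isalpha():
        raise ValueError(f"expected 'us' or a two-letter postal code, got {postal!r}")
    return postal + unit.strip().lower()


def coverage_prefix(file_name: str) -> str:
    """Return the two-letter coverage prefix ('us' or a state postal code)."""
    return file_name[:2].lower()


print(data_file_name("AL", "tracts"))        # Alabama tracts
print(coverage_prefix("njtracts.sas7bdat"))  # New Jersey
```

The prefix rule holds only "generally", as noted earlier; a file such as NewEnglandMCDs.sas7bdat does not follow it.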

Missouri Only Data Sets

There are a number of data sets that exist only for Missouri. These are mostly custom aggregations of the data to different geographic units. These sets have names beginning with mo, followed by the geographic units summarized within the data set. For example, morpcs.sas7bdat contains summaries for Missouri Regional Planning Commissions, and mobgs150.sas7bdat contains data summarized for census block groups. (The files as released contain summaries for block groups split by several higher level geographies, namely VTD, MCD and place, but many users just want a summary for complete block groups.)

United States Coverage Data Sets

We have extracted and/or aggregated data in the 51 state level data sets to create a series of national data sets. Each of these data sets has a name beginning with us. These sets provide summaries for the entire U.S., its regions and divisions (in the set usregdiv.sas7bdat), for states (usstates), for counties and equivalents (uscntys), for metropolitan areas using 2000 definitions (usmetros), and for places (cities) with at least 2500 population (usplaces). There is also a data set, uscntychange, that combines the TotPop count from pl942000 with the revised 1990 count and all annual intercensal estimates by year.

And New England Towns

Minor Civil Divisions are known as towns in New England. Because they are used as the building blocks for metro areas there, we created a separate data set with the MCD-level summaries for that entire region. This data file is called NewEnglandMCDs.sas7bdat.

Codebook - Description of the Variables/Columns in Each Data File

Not all variables occur in all data sets, especially the identifier variables. This codebook does not apply to the xxdetail datasets.

Geocodes and Other ID Variables

For a more detailed description of (most of) these codes see the Identification Section (within the Data Dictionary, Chapter 7) of the Census Bureau's official technical documentation.

Occurs only on selected data views. This was our attempt to create a single character string key that would provide a linkage to Arcview shape files being provided by various sources. These strings are just a set of geographic code values strung together.

A sequentially assigned numeric key that links the data back to the state's geos data file.

Our locally generated concatenation of geographic codes, separated by dashes, that uniquely identifies the area. Not as useful as we had hoped. Probably best to ignore it.

Geographic Summary Level code.

SUMLEV Code   Summary Level
040           State
050           County (or county equivalent)
060           County subdivision (MCD, township, CCD)
140           Census tract (complete)
155           Place (within county)
160           Place (complete)
500           Congressional District (106th)
610           State Senate District (as of 2000)
620           State House District (as of 2000)
700           VTD (Voting Tabulation District)
740           Block Group, split by VTD, MCD, and Place
750           Census Block

Note: On Summary File 1, block summaries have a SUMLEV code of '101'. What's the difference? Sort order, we guess. To us, a census block is a census block; it cannot be split by anything because it is by definition the smallest thing there is in terms of census geography.
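For users who read these files outside SAS, the summary level codes listed above can be kept in a simple lookup table. The sketch below is a hedged illustration in Python; the names SUMLEV_LABELS and sumlev_label are hypothetical, and only the codes listed in this document are included.

```python
# Summary level codes exactly as listed in this codebook.
SUMLEV_LABELS = {
    "040": "State",
    "050": "County (or county equivalent)",
    "060": "County subdivision (MCD, township, CCD)",
    "140": "Census tract (complete)",
    "155": "Place (within county)",
    "160": "Place (complete)",
    "500": "Congressional District (106th)",
    "610": "State Senate District (as of 2000)",
    "620": "State House District (as of 2000)",
    "700": "VTD (Voting Tabulation District)",
    "740": "Block Group, split by VTD, MCD, and Place",
    "750": "Census Block",
}


def sumlev_label(code) -> str:
    """Return the label for a SUMLEV code, accepting '040', '40' or 40."""
    key = str(code).strip().zfill(3)  # restore leading zeroes lost in numeric reads
    return SUMLEV_LABELS.get(key, "Unknown summary level " + key)


print(sumlev_label(40))
print(sumlev_label("750"))
```

Codes are stored as strings because the leading zero is significant; a code read as the number 40 is padded back to "040" before the lookup.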

Name of the geographic area being summarized.

FIPS state code.

State postal abbreviation.

5-digit FIPS county code. Will sometimes print as the county name because $county. format code has been permanently assigned to it.

Census tract in format. Leading and trailing zeroes coded. Tract codes are unique within counties and all areas are assigned a tract code.

Logical Record Number. Important when reading the original set of 40 files that were used to create these data sets. Not nearly as interesting here, but it does form a link back to the original raw data files and could be used as a link variable, though we prefer to use geo_id for that purpose.

U.S. Region

U.S. division.

Census state code. Rarely used any more. Use the FIPS code.

County code (3-digit version) -- See county variable, above, which is 5 characters and includes the state code.
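The relationship between the 5-digit county code and its state and 3-digit county parts can be sketched as below. The function name split_county_fips is hypothetical, and the example value 29189 (Missouri, St. Louis County) is used only for illustration.

```python
from typing import Tuple


def split_county_fips(county: str) -> Tuple[str, str]:
    """Split a 5-digit FIPS county code into (2-digit state, 3-digit county)."""
    county = str(county).strip()
    if len(county) != 5 or not county.isdigit():
        raise ValueError(f"expected a 5-digit FIPS county code, got {county!r}")
    return county[:2], county[2:]


state, cnty = split_county_fips("29189")
print(state, cnty)
```

Keeping the codes as character strings preserves leading zeroes, which matter for states such as Alabama (01).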

County size code.

County subdivision (MCD) FIPS code. Unique within state. (The census 3-digit code for these entities that was unique within county and was used in earlier censuses is not included on 2000 files.)

County subdivision class code.

County subdivision size code.

Place FIPS code. Unique within state. (The census 4-digit place code that was used in earlier censuses is not included on 2000 files.) A value of 99999 is sometimes used to indicate a non-place remainder of an area, where "non-place" means not incorporated and not in a census designated place.

placecc Place class code.

placedc Place description code.

Place size code.

Block group. (First digit of block on block level observations.) Block groups are unique within census tracts.

Census Block. A 4-digit code this decade, the first digit being the block group. In 1990 the block codes were 3 digits with an optional alpha suffix. In 1970 and 1980 blocks were just 3 digits. This is the smallest geographic level that the census recognizes. Blocks are unique within census tracts, which are unique within counties.
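Because the block group is simply the first digit of the 4-digit 2000 block code, it can be derived directly from the block. A minimal Python sketch follows; the function name block_group_of and the sample block code are hypothetical.

```python
def block_group_of(block: str) -> str:
    """Return the block group (first digit) of a 4-digit Census 2000 block code."""
    block = str(block).strip()
    if len(block) != 4 or not block.isdigit():
        raise ValueError(f"expected a 4-digit block code, got {block!r}")
    return block[0]


print(block_group_of("2014"))
```

This applies only to the 2000 codes; as noted above, 1990 and earlier block codes had a different layout.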

concit Consolidated city.

Metropolitan Statistical Area/Consolidated MSA. As of Jan. 1, 2000 we think.

Metropolitan Area Size Code.

Two-digit code for CMSA.

Metropolitan area central city indicator.

Primary MSA.

New England County Metropolitan Area code. N.A. outside New England.

Urban Rural code - would be very valuable but it is NOT defined on pl94. (Maybe later we can add it.) For now, the field is always blank.

Congressional District - 106th (1998)

State Leg District Upper Chamber (Senate district in Missouri)

State Leg District Lower Chamber (House district in Missouri)

Voting Tabulation District. How these are assigned and documented varies from state to state and sometimes from county to county. Used almost exclusively in redistricting applications.

VTD indicator. Indicates whether this is a real or a pseudo VTD.

3-digit ZCTA code - not used on pl94.

ZIP Census Tabulation Area - not used on pl94.

Land Area in Square Meters

Water Area Sq Meters

Functional Status Code.

Geographic Change User Note Indicator.

100% Population Count. Should be the same as TotPop on these files.

Internal point north latitude. Accurate to 6 decimal places.

Internal point west longitude (will be negative). Accurate to decimal places.

Legal/Statistical Area Description Code. See
format code for values.

Blank for many summary levels. Where used, values are "P" to indicate an area that is split (e.g. just part of a tract or BG), or "W" to indicate this is the whole area.

Elementary school district (NCES code) This and the other 2 school codes appear on the block level summaries only. To get data by school district you have to sum data at the block level.

Secondary school district (NCES code) NA for Missouri.

Unified school district (NCES code).
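Since the school district codes appear only on block level summaries, district totals have to be built by summing blocks. The sketch below shows one way to do that in Python on rows already read into dictionaries. The key name "district" and the sample rows are made up for illustration; only TotPop is a variable name taken from this document.

```python
from collections import defaultdict


def sum_by_district(block_rows, district_key="district", count_keys=("TotPop",)):
    """Sum block-level counts into one total per school district code.

    'district_key' is a hypothetical column name; substitute whichever of the
    three school district code variables applies to your data.
    """
    totals = defaultdict(lambda: dict.fromkeys(count_keys, 0))
    for row in block_rows:
        code = row.get(district_key)
        if not code:  # skip blocks that carry no district code
            continue
        for key in count_keys:
            totals[code][key] += row[key]
    return dict(totals)


# Made-up block rows, for illustration only.
blocks = [
    {"district": "A", "TotPop": 10},
    {"district": "A", "TotPop": 5},
    {"district": "B", "TotPop": 7},
    {"district": "", "TotPop": 3},
]
print(sum_by_district(blocks))
```

Additional count variables can be summed in the same pass by listing them in count_keys.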

Transportation Analysis Zone. This code appears on block level summaries only.

Urban growth area. Only relevant in state of Oregon.

PUMA Code as used on 5% PUMA files. NA on pl94 files.

PUMA Code as used on 1% PUMA files. NA on pl94 files.

Land Area Sq Miles - derived from value in square meters.

Total Area Sq Miles - derived from AreaLand + AreaWatr and converted to square miles.
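The derivation of the square mile values can be sketched as follows. This is an assumption about the arithmetic, not a record of the exact factor used when the files were built: it uses the exact definition of 1 mile = 1609.344 meters, so results may differ slightly from the stored values if a rounded factor was used. The function names are hypothetical; AreaLand and AreaWatr are the square meter variables named above.

```python
SQ_METERS_PER_SQ_MILE = 1609.344 ** 2  # 1 international mile = 1609.344 m exactly


def sq_meters_to_sq_miles(sq_meters: float) -> float:
    """Convert an area in square meters to square miles."""
    return sq_meters / SQ_METERS_PER_SQ_MILE


def total_area_sq_miles(area_land: float, area_watr: float) -> float:
    """Total area in square miles from AreaLand + AreaWatr (both in square meters)."""
    return sq_meters_to_sq_miles(area_land + area_watr)


print(round(sq_meters_to_sq_miles(10_000_000), 4))
```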

Tabular Data

Total Population.

White alone: Persons who checked the "White" option and no other in response to the race question.

White alone or in combination: Persons who checked the "White" option in response to the race question (regardless of whether or not they checked any other).

Each of the following pairs of race counts uses comparable definitions, i.e. they report persons who checked only that race and persons who checked that race regardless of what else they checked.

Black alone

Black alone or in combination

Amer Indian alone

Amer Indian alone or in combination

Asian alone

Asian alone or in combination

Hawaiian or PI alone

Hawaiian or PI alone or in combination

Some Other race alone

Some other race alone or in combination

White alone, Non Hispanic

White alone or in combination, Non Hispanic

Hispanic Population. This is not a race. On the census questionnaire, there is a question regarding race that does not have Hispanic as one of the choices. Instead, there is a separate question asking if you are Hispanic. Thus, a person can be both white and Hispanic, or both black and Hispanic.

Population Aged 18 and over.

White alone, over 18

White alone or in combination, over 18

Black alone, over 18

Black alone or in combination, over 18

AIAN alone, over 18

AIAN alone or in combination, over 18

Asian alone, over 18

Asian alone or in combination, over 18

Hawaiian or PI alone, over 18

Hawaiian or PI alone or in combination, over 18

Other race alone, over 18

Other race alone or in combination, over 18

White Non-hisp alone, over 18

White Non-hisp alone or in combination, over 18

Hispanic Pop Over 18

Multi Racial - Persons who checked more than one race.

Multi Racial Over 18
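Given the definitions above, the multi-racial count should equal the total population minus the sum of the six "alone" counts, since each person is counted either in exactly one "alone" category or as having checked more than one race. The Python sketch below expresses that as a consistency check. The function name and the sample numbers are made up for illustration, and the identity is an assumption about how the items relate, not a documented processing step.

```python
def multi_racial(tot_pop: int, alone_counts) -> int:
    """Persons of more than one race, as total minus the six 'alone' counts."""
    alone_total = sum(alone_counts)
    if alone_total > tot_pop:
        raise ValueError("sum of 'alone' counts exceeds total population")
    return tot_pop - alone_total


# Made-up counts: White, Black, Amer Indian, Asian, Hawaiian or PI, Some Other race.
print(multi_racial(1000, [700, 150, 10, 40, 5, 70]))
```

The same check can be applied to the voting age items by passing the over 18 total and the six over 18 "alone" counts. The "alone or in combination" counts cannot be used this way, because a person who checked two races is counted in both.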

