Unless otherwise noted, these are all 1990 geographies. A revised version that deals with 2000 and post-2000 geographies is available here.
This help file presents information regarding each of the geographic units ("geocodes") that occur in the MABLE database and that can be accessed by the Geocorr engine. Unless otherwise specified all codes are FIPS (Federal Information Processing Standard).
Unless explicitly stated otherwise, all codes in MABLE are based on the Census Bureau's 1990 ZIP-Block Equivalency file. Thus, geocodes such as ZIPs and places which routinely change (boundaries change, new ones are added and old ones are deleted) will not be current — they will be as of 1990 (or 1991, in the special case of the ZIP codes — for details see the ZIP entry, below.)
The Missouri State Census Data Center and OSEDA maintain a library of geographic code modules in the form of SAS format codes. These modules have special application for SAS software users, since they allow codes to be readily converted to their corresponding names. Sometimes format modules are used not to provide names, but rather to link codes to other entities as a kind of table lookup. Note that although these modules are, technically, "code" you do not have to be a programmer or know any SAS to use these as codebook files to look up a geographic code.
The MABLE database is really a collection of 51 state-level databases. The state geocode is almost always added to output files even if it is not explicitly selected. The District of Columbia is considered a state for the purposes of this application. Other statistically equivalent areas, such as territories and outlying areas (including Puerto Rico) are not states and are thus not part of MABLE. This is a two-digit FIPS (Federal Information Processing Standard) code with leading zeroes.
The FIPS county codes are three-digit numbers assigned within states. They generally are odd numbers assigned in alphabetical order. Exceptions are independent cities (i.e., cities like Baltimore and St. Louis that are not in any county and serve as county equivalents), which are usually assigned codes over 500 (such as 510). On output files and listings we usually combine the FIPS state and county codes. Thus, the value of the County variable for Autauga County, Alabama is 01001 and for Baltimore City, Maryland is 24510. In some states (such as Louisiana and Alaska), the primary substate legal entities are not called counties; but for the sake of this application they are county equivalents and act exactly the same as counties. Counties appearing here are those defined at the time of the 2000 census. See the Census Bureau web page describing any changes that may occur. There are approximately 3141 counties in the U.S.
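For those working with these codes in a program, the combined state-plus-county value described above can be sketched as follows. This is only an illustration in Python; the function names county_geocode and is_independent_city are our own and are not part of Geocorr or MABLE, and the "over 500" test is just the rule of thumb given above, not an official definition.

```python
def county_geocode(state_fips, county_fips):
    """Combine state and county FIPS codes into the 5-character County value."""
    # Geocodes are character strings, so zero-pad to keep leading zeroes.
    state = str(state_fips).zfill(2)
    county = str(county_fips).zfill(3)
    if len(state) != 2 or len(county) != 3 or not (state + county).isdigit():
        raise ValueError(f"invalid FIPS codes: {state_fips!r}, {county_fips!r}")
    return state + county


def is_independent_city(county_value):
    """Rule of thumb only: county codes over 500 usually mark independent cities."""
    return int(county_value[2:]) > 500


autauga = county_geocode(1, 1)          # Autauga County, Alabama
baltimore = county_geocode("24", 510)   # Baltimore City, Maryland
```

Keeping the value as a string rather than a number is the important design point: a numeric 1001 would lose the leading zero that identifies Alabama.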
These are the primary geographic units recognized by the Census Bureau which are just below the county level. Most states have Minor Civil Divisions (MCDs), which are legally recognized governmental or administrative units. MCDs are defined in 28 states and in D.C. In the remaining states, the Census Bureau has defined Census County Divisions (CCDs). Most states have either all MCDs or all CCDs (with Missouri being an example of a state that has both). "MCD" is a generic category; the specific types of MCDs vary by state. The most common type of MCD is the township. Other types of areas that can be MCDs include towns or incorporated places, election districts, plantations, magisterial districts, etc. In the geographic hierarchy, these divisions provide a complete coverage of all counties in the country. There were approximately 35,000 such geographic areas in the U.S. at the time of the 1990 census.
The Smcdcnvt format module shows the relationship between the Census Bureau codes used for these entities and the FIPS codes (as used in MABLE). The FIPS codes are five digits and are unique within state, whereas the Census Bureau codes are only three digits and are unique within county. The Census Bureau codes are no longer used for current data, and we are not even sure if such codes are assigned for new county subdivisions. They are of interest only when needing to link to earlier data that used these codes.
On output files and listings generated by Geocorr, this variable goes by the name cousubfp ("COUnty SUBdivision FiPs").
Lots of variety for this geographic level. Places have different names in different states (e.g., "cities", "towns", "boroughs", "villages", etc.) There are also approximately 4,000 entities called Census Designated Places (CDPs) which have no formal, legally recognized boundaries but which the Census Bureau has designated as areas which are generally recognized by the local population as worthy of having data tabulated for them. Places are within states, but otherwise they can cross just about any other boundary. A place can be in multiple counties, in multiple county subdivisions, ZCTAs, etc. Places are mutually exclusive but are not exhaustive — there are areas that are not contained in any place. The Census Bureau (and MABLE) assign a code of all "9"s to areas that are not within any place. These "pseudo-places" are simply referred to as Unincorporated Remainders; they are not simply "unincorporated portions" (in general) because in many cases part of the unincorporated area of a county or MCD is in a CDP.
The relationship of MCD-CCDs (i.e., county subdivisions) and places varies from state to state, but in general places may cross MCD-CCD boundaries. There are some places which are also MCDs (common in New England). In these cases, the FIPS MCD and place codes are the same (but not the census codes). There were approximately 25,000 places recognized for the 1990 census.
Note: Places are among the most unstable of geographic entities over time, and are perhaps the most difficult to identify accurately since their boundaries are often "invisible" — i.e., do not follow physical features that are easily identifiable. Because of this, the Census Bureau has a real challenge trying to keep up with accurate place definitions. The codes used in MABLE come from the Bureau's official SF1 geographic headers files, which define the geography of the United States as of 1990. The place codes that appear in MABLE reflect what the Bureau recognized for city boundaries when it tabulated the 1990 census. It is an accepted fact of the census-taking business that there will always be mistakes regarding these boundaries. The Bureau has published a special file called the CQR (for "Census Quality Review") that is the official list of known geographic coding problems.
On output files and listings generated by Geocorr this variable goes by the name placefp. The Splccnvt format table shows the relationship between the Census Bureau codes used for these entities and the FIPS codes (as used in MABLE). The FIPS codes are five digits and are unique within state, while the Census Bureau codes are four digits and are also unique within state.
The census tract is part of the very useful four-level hierarchy of census data, in which each lower level is completely contained within its parent level. The four levels are county, tract, block group, and block.
The Census Bureau has complete control over these "small-area" geographic units. The Bureau defines them solely for the purpose of collecting and tabulating the results of the census. In most metropolitan areas, local census tract committees are appointed which are responsible for drawing up suggested boundaries for the census tracts in their areas. In most rural areas, there can be (but usually are not) such committees, and the Census Bureau defines the tracts. In the 1990 census, areas that were defined without the input of a local tract committee were called Block Numbering Areas (BNAs). Prior to the 1990 census, this coverage was not complete, i.e., for many areas in 1980, there simply were no census tracts or BNAs. Where tracts have existed for several decades, there is frequently a correspondence between the codes used between one census year and the previous decade. Most census tract committees make efforts to keep tract boundaries consistent over time to facilitate time trend analysis. But when major development takes place this is not always possible. For the sake of this discussion we'll refer to these areas as simply "tracts".
Among the criteria that the Census Bureau has established for defining tracts is that they should be compact contiguous areas with populations of about 4,000 persons and that the area should, if possible, be homogeneous. The ideal urban census tract would be a locally recognized "neighborhood" within a city.
Census tracts are assigned four-digit numeric codes, unique within counties. Tracts can also have a two-digit suffix code, usually indicating that this is a "split" of a tract from an earlier census year. Thus if 1234.00 was a tract in 1990 with 5,000 persons and that area grew to a population of 12,000 by 2000, you might see three tracts in 2000 with codes 1234.01, 1234.02, and 1234.03. Suffix codes of .97 and .98 are special and have to do with details most people would rather not be bothered with. The short explanation is that they mark cases where there was a "temporary problem" with a tract assignment that was "fixed," but the suffix code had to be attached. Suffix codes of .99 are used for pseudo-tracts used to tabulate "crews of vessels" residing in nearby rivers and lakes. BNAs can be distinguished from census tracts by the first digit: If it's a 9 then it's a BNA, otherwise it's a census tract.
Census tract/BNA codes on all output files and reports from Geocorr are named tract and are always represented in a full seven-character xxxx.xx format with leading and trailing zeroes. There were about 62,000 of these entities defined for the 1990 census, about 11,000 of them classified as BNAs. (The Bureau has announced plans to do away with the distinction between tracts and BNAs starting with the 2000 census.)
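The seven-character tract format and the BNA rule can be illustrated with a short Python sketch. The helper names format_tract, is_bna and tract_key are hypothetical (they are not Geocorr functions); the sketch simply assumes you have the four-digit basic code and the two-digit suffix as numbers or strings.

```python
def format_tract(basic, suffix=0):
    """Return a tract code in the full seven-character xxxx.xx form."""
    basic, suffix = int(basic), int(suffix)
    if not (0 <= basic <= 9999 and 0 <= suffix <= 99):
        raise ValueError("basic code must fit in 4 digits and suffix in 2")
    # Leading zeroes on the basic code, trailing zeroes on the suffix.
    return f"{basic:04d}.{suffix:02d}"


def is_bna(tract):
    """A first digit of 9 marks a Block Numbering Area rather than a tract."""
    return tract.startswith("9")


def tract_key(county, tract):
    """Tract codes are unique only within a county, so always key on both."""
    return (county, tract)


split_tract = format_tract(1234, 1)   # one of the "split" tracts
unsplit_tract = format_tract(12)      # no suffix given, so .00 is used
```

The tract_key helper reflects the point that a tract code by itself does not identify an area; the county code has to travel with it.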
Whenever you select this geocode (using Geocorr) from either the source or target geocode select lists, the county code is also automatically selected for you — you should never process tract data without carrying along the county code, unless, of course, your entire analysis is taking place within a single county.
If you understand census tracts, then all you need to know is that block groups are the next level down in the hierarchy. A typical census tract will be comprised of about five or six block groups. The name comes from the fact that each block group (BG) is composed of census blocks, grouped together within a tract. The first digit of the block number is the block group code. Thus the block group geocode is only one character long, which, of course, is meaningless outside of the context of the tract and county. Block groups with a code of 0 indicate a coastal area entirely comprised of water. There were approximately 230,000 block groups defined for the 1990 census.
From a data perspective, block groups have the distinction of being the smallest geographic unit (well, almost) for which the Census Bureau tabulates detailed demographic data. This means that if you are looking for data regarding income, occupation, or education (to name three popular subjects only available in the sample data) then the smallest geographic unit for which you'll be able to get that data is the block group.
On all Geocorr output files this geocode will be called bg and will be a single character (digit) long.
Block groups do not have names associated with them, in general, and there are no format codes available for them.
This is the atom in the MABLE view of the matter. It is the smallest geographic entity recognized by the Census Bureau. It is generally the smallest area that can be formed by intersecting visible features. The classic census block is the rectangular city block bounded by four streets. With the extending of block assignment to rural areas for the 1990 census, we also now have the 100-square-miles-of-open-desert blocks and the classic single farm or portion of farm block. Each of the nearly 7 million observations on the MABLE database describes one census block. All other geographies are defined in terms of which of these blocks can be added together to form it. This is a fudge in some cases — notably with ZIP (ZCTA) codes, but makes perfect sense for most geographies. This is because the Census Bureau made a decision when it redesigned census block geography for the 1990 census that it would not have any blocks that crossed county subdivision or place boundaries. The Geocorr concept is based on the idea that geography can be reduced to a set of block-level "pixels" that can be used to examine how other geographies are related using simple algebra instead of complex geometry.
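To make the "simple algebra" idea concrete, here is a minimal Python sketch of how block-level records could be summed to relate one geography to another. It is not the Geocorr implementation; the record layout, the area labels (Z1, C1, etc.), the populations, and the name allocation_factors are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical block records: (source geocode, target geocode, population).
# Each row stands for one census block, the "pixel" of the database.
blocks = [
    ("Z1", "C1", 400),
    ("Z1", "C1", 100),
    ("Z2", "C2", 300),
    ("Z2", "C1", 100),
]


def allocation_factors(block_rows):
    """Share of each source area's population that falls in each target area."""
    pair_pop = defaultdict(int)     # population of each source/target overlap
    source_pop = defaultdict(int)   # total population of each source area
    for source, target, pop in block_rows:
        pair_pop[(source, target)] += pop
        source_pop[source] += pop
    return {
        (source, target): pop / source_pop[source]
        for (source, target), pop in pair_pop.items()
        if source_pop[source] > 0   # skip unpopulated source areas
    }


factors = allocation_factors(blocks)
```

Because every block belongs to exactly one area of each kind, relating two geographies reduces to grouping and adding; no boundary geometry is needed.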
Census blocks are the last level of the county-tract-bg-block hierarchy. The block code itself always has 3 digits and may have an alpha suffix (e.g., 301A). The first digit is never zero and is the same as the BG code (all the blocks with the same first digit within a tract are, by definition, a block group whose code is that first digit).
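The relationship between block codes and block group codes described above can be sketched in Python. The pattern and the function name block_group_of are our own assumptions for illustration; the sketch accepts only three digits with an optional single-letter suffix and a non-zero first digit.

```python
import re

# Three digits (the first never zero) plus an optional alpha suffix, e.g. 301A.
BLOCK_PATTERN = re.compile(r"^([1-9])\d{2}[A-Z]?$")


def block_group_of(blk):
    """Return the one-character BG code implied by a 1990 block code."""
    match = BLOCK_PATTERN.match(blk.strip().upper())
    if match is None:
        raise ValueError(f"not a valid 1990 block code: {blk!r}")
    # The block group code is simply the first digit of the block code.
    return match.group(1)


bg = block_group_of("301A")
```

As with tracts, the resulting BG code only means something when paired with its county and tract codes.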
Water blocks are not included in the MABLE database. These are areas that are made up entirely of water, usually lakes or other inland water features, or portions of rivers or oceans following the shore of any area. These entities (which some have claimed are misnamed since they are actually not "blocks" at all, but a different entity altogether that just look like blocks) do not appear on the ZIP-block equivalency file upon which MABLE is based, nor in any of the other tabulation or header files released by the Bureau. Codes for them do occur in the TIGER geographic base files, however. There they are assigned codes ending with 99a, where "a" is an alpha suffix. The first character is the BG number. In a given BG, if there are four distinct bodies of water they might be assigned codes of g99A, g99B, g99C and g99D, where "g" is the BG number. The omission of these areas from any of the Bureau's block files creates a hole in MABLE for some potential applications, especially those that might deal with relating environmental data with demographic. It will be important for the future development of this application to find a way to get these areas incorporated into MABLE.
On all output files and listings produced by Geocorr, the census block geocode is called blk and is four characters long. Census blocks do not have names associated with them.
For the 2000 Census, the Census Bureau decided to change the name of the geographic entity that they had previously referred to as ZIP codes. They would now be called ZCTAs (ZIP Census Tabulation Areas). There were actually rather minor changes in the way these entities were defined as compared to 1990, but the Bureau decided that it would be helpful to alert the users to the fact that these entities were not exactly ZIP codes. But, for most purposes, pretty close. To see the definitive word on the what, why, and how of ZCTAs see the Bureau's ZCTA web page or the Geographic Terms and Concepts - ZCTAs web page.
Note: From here on down, we are talking about the ZIP codes used for the 1990 Census and the original 1990 MABLE database. But most of what is said about them applies as well to the 2000 ZCTA codes, which replace them on the 2000 MABLE database.
These are among the most useful and in some ways most inaccurate fields on the MABLE database. Their source, unlike most of the other geographies on the database, is not the Census Bureau. Not directly, at least. The Bureau contracted out to a private vendor to have a 1990 census block to then-current ZIP code file created for them. This project was carried out in 1991 and the approximate date of the ZIP codes on the file is October of 1991. Of course, ZIP codes are famous for changing: new ones get created and sometimes (rarely) old ones disappear. Most importantly, ZIP codes change their definitions, but not their codes. So working with ZIP codes over time is always a problem.
Only "residential" ZIP codes — those containing household addresses — have corresponding ZCTA codes (and hence will appear in a MABLE database). There are no business or Post Office Box-only ZCTAs, etc. The latter account for about a fourth of all ZIP codes in the U.S.
Another problem is that ZIP codes are not really spatial entities — they are simply lists of addresses, organized to facilitate mail delivery. While they often do form areas that can be viewed as geographic areas, that is not what they really are. This can create problems when you try to relate them to a spatial entity such as a census block. Think of a classic census block formed by the intersection of 1st St., Elm Ave, 2nd St., and Pine Ave. If 1st St. is the northern border of the block, then folks living on the south side of 1st St. between Elm and Pine are in our block (let's call it 101). But people living across the street — on the north side of 1st St. — are living in a different block, say 102. But the U.S. Postal Service would never have a ZIP boundary go down the middle of a street. If this were an area where the ZIP changed, it would almost certainly divide along (vague and invisible) "back-lot lines." For example, the folks living on both sides of 1st St. in our example might live in ZIP 12345, while the folks living on 2nd St. might live in 12346. Thus you have households in the same census block, but in different ZIP codes. Hence, the fundamental concept of census block as the atomic unit is violated. Of course, this only happens in a certain percentage of blocks, and in many cases the ZIP boundaries are on commercial streets where not many people live and you can assign most of the population in the boundary blocks to the right ZIP.
So the vendor's job, per the Bureau's specifications, was to assign each block to a single "best" ZIP code. When this equivalency file was delivered to the Bureau they used it to create a special summary tabulation of the 1990 census data called STF3B ("Summary Tape File 3, subfile B"). This was the only census product that let you get income, education, or even basic population count data for ZIP codes. It was an estimate based on estimating what blocks made up what ZIPs. The assumption was that most applications that involve ZIP codes don't have to be absolutely exact, just close. And that is what we have here with Geocorr: these correspondences are never going to be exact or perfect. But they'll be good enough for very many applications.
For more information about ZIP codes, see the MCDC's own ZIP Code resources page.
This format code was derived from a file from the U.S. Postal Service. It's a combination of post office and local geographic names. It is the source for the ZIPNAME fields that will be added to your Geocorr outputs, if you specify that you want names to go with your geocodes and you also select ZIP as one of your geocodes.
This is not exactly a geocode in the same sense as the other ones used here, but it is far too useful to discard on a technicality. All census blocks are assigned this characteristic by the Census Bureau based on their standard definition of the concept of urban: All population and territory within the boundaries of UAs and the urban portion of places outside of UAs that have a decennial census population of 2,500 or more. Of course, this brings up the question of what is the definition of a UA and, indeed, what does "UA" even stand for? It stands for "Urbanized Area" and its definition (short form) is: An area consisting of a central place(s) and adjacent urban fringe that together have a minimum residential population of at least 50,000 people and generally an overall population density of 1,000 people per square mile of land area. The Census Bureau uses published criteria to determine the qualification and boundaries of UAs.
If someone asks you what it means the best answer is just to say "A place is urban if it's in a major city or the suburbs of a major city, or in a town of 2,500 or more". That's not exactly true, but it's much simpler and it's very close. If you would like to get the "official" definition of urban/rural, it is available at the Census Bureau's web site.
This geocode has two values: 1 means urban and 2 means rural. On Geocorr output files, this field is called urbanrur and is one character long.
The Office of Management and Budget actually defines these metropolitan area entities based on decennial census data. The code used here was the definition that was in effect at the time of the 1990 census (not the one that was assigned a year or two later based on the findings of that census, nor the ones that are in effect today, since OMB modifies these definitions annually with most changes occurring right after a decennial census). MSA stands for Metropolitan Statistical Area. Most metropolitan areas (like St. Louis, Pittsburgh, Des Moines, and Louisville) are simple MSAs. Consolidated MSAs (CMSAs) occur when two or more MSAs are joined, or when there is a large urban area with more than one central city. Examples of CMSAs are Chicago-Gary-SE Wisconsin, Washington, DC-Baltimore, Dallas-Ft. Worth, and the Bay Area. CMSAs are then broken down into PMSAs, which are also available. This more complex system of classifying metro areas replaced the simpler SMSA concept that was used in earlier censuses. Note that codes for metro areas are unique without any qualifiers, and that metro areas can span states.
Except in New England, metro areas are made up of complete counties. In New England they are made up of complete towns (MCDs.)
The Census Bureau has created an excellent web page describing Metro Area concepts with links to current codes and geographic components.
This format code is actually a little later than the 1990 definition and includes some other kinds of metros, but it should provide codes and names for all the codes on MABLE.
On Geocorr output files, this field will be called msacmsa and will be four characters wide. A value of 9999 is used to indicate an area that is not within a metro area.
Most of what was said for the previous geocode (MSA/CMSA) applies here as well, except that PMSA will have a value of 9999 for any area that is not inside a PMSA. Even if it is metropolitan, if it is not in a CMSA (and hence a PMSA) then it has the all nines code. (See also the Bureau's metro area web page.)
This format code handles MSA, CMSA, and PMSA codes and returns the names of the areas. Note that these three kinds of codes do not overlap (i.e., if there is an MSA with code 1234, then there will never be a PMSA or CMSA with that code). The format code will return a (P) at the end of the metro name to indicate a Primary MSA.
On Geocorr output files, this field will be called pmsa and will be four characters wide. It will have a value of 9999 to indicate not applicable.
As the name suggests, these areas are defined only within the six-state New England region. NECMAs are used as alternative metro areas by those wanting to be able to aggregate county-level data to metro areas. (The standard MAs in New England do not follow county boundaries, unlike the rest of the country.)
We referred to Urbanized Areas above in our discussion of Urban/Rural. The details of the definition of an Urbanized Area are complicated. It differs from a metro area, which is a more "rounded off" definition of a metropolitan area. The idea behind the UA is to single out the part of a metropolitan area that is densely settled and contiguous to the central city.
On all Geocorr output files this field is called urbarea and is four characters wide.
The U.S. Congressional Districts as defined at the time of the 1990 census, i.e. before the major redistricting of 1991 (based on the results of the 1990 census). It's a two-digit code. In the seven states where there is only a single CD, the code is 00. Otherwise values run from 01 through the number of seats for the state. Like all other geocodes in MABLE, it is a character string with leading zeroes.
On all Geocorr output files, this field is called cd102 and is two characters wide.
This is a field that was not part of the ZIP-Block equivalency file but was added using another equivalency file provided by the Census Bureau. These are the codes reflecting redistricting in 1991 and which were used in the 1992 elections. Like the cd102 field, it will have a value of 00 for states where there is a single district.
On all Geocorr output files, this field will be called cd103 and will be two characters wide.
The smallest geographic area identified on the 1990 Public Use Micro Sample files (File A — 5% sample). Boundaries of PUMA areas had to be defined in terms of counties, places, county subdivisions or census tracts. In a very large majority of cases PUMAs consist of one or more counties. In larger metro counties, they are frequently broken down along the smaller geographic area lines (places and/or census tracts). A strict guideline for defining PUMAs is that they must have a minimum population of 100,000 persons (as of the 1990 census). The Census Bureau has distributed several products in an attempt to define the boundaries of these entities, none of which are complete and which in many cases obscure the fairly simple nature of the PUMA assignment, especially in metropolitan areas. Some detective work was required to fill in the holes in the Bureau's sources. For a more detailed description of these entities with links to related resources, see the MCDC's PUMAs page.
PUMA codes are five digits (characters) long. Most end with 00. Generally when the last two digits are not zeroes, it represents a county that has been split into subareas. Thus, for example, the PUMA codes for City of St. Louis are 01201, 01202, and 01203.
On all Geocorr output files these fields will be called puma5 and will be five characters wide with leading and trailing zeroes. There will be no names associated with them.
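The PUMA code convention just described can be expressed as a small Python sketch. The function name is_subcounty_puma is hypothetical, and the test it applies is only the general rule of thumb stated above (last two digits not zeroes), which the text itself says holds "generally" rather than always.

```python
def is_subcounty_puma(puma5):
    """Rule of thumb: a PUMA code not ending in 00 is a piece of a split county."""
    code = str(puma5).zfill(5)   # keep leading zeroes; codes are strings
    if len(code) != 5 or not code.isdigit():
        raise ValueError(f"not a five-digit PUMA code: {puma5!r}")
    return code[-2:] != "00"


# The three City of St. Louis codes mentioned above all end in non-zero digits.
st_louis_city = ["01201", "01202", "01203"]
all_split = all(is_subcounty_puma(code) for code in st_louis_city)
```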
Basically just a twin of the PUMA codes of the A sample. In many cases the PUMA codes used for the A and B samples were the same, or had only minor differences within a state. A general rule of thumb is that B PUMAs are more likely to be defined so that they can be used to identify metropolitan areas.
On all Geocorr output files these codes will be called bpuma and will be five characters wide with leading and trailing zeroes. The bpumas have no names associated with them.
Unlike the data files, the designation of "99" areas (those B PUMAs which span across state lines) does not occur in MABLE. Every block has a geocode of state, thus the bpuma codes are defined as B PUMAs within state.
This data layer is based on the Hydrologic Unit Maps published by the U.S. Geological Survey Office of Water Data Coordination, together with the list descriptions and name of region, subregion, accounting units, and cataloging unit. The hydrologic units are encoded with an eight-digit number that indicates the hydrologic region (first two digits), hydrologic subregion (second two digits), accounting unit (third two digits), and cataloging unit (fourth two digits). More information is located at the USGS.
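The structure of the eight-digit hydrologic unit code can be shown with a short Python sketch. The function name parse_huc and the dictionary keys are our own choices, and the code value used in the example is arbitrary; it is not meant to identify any particular watershed.

```python
def parse_huc(huc):
    """Split an eight-digit hydrologic unit code into its four two-digit parts."""
    code = str(huc).zfill(8)   # restore a leading zero lost in numeric storage
    if len(code) != 8 or not code.isdigit():
        raise ValueError(f"not an eight-digit hydrologic unit code: {huc!r}")
    return {
        "region": code[0:2],            # first two digits
        "subregion": code[2:4],         # second two digits
        "accounting_unit": code[4:6],   # third two digits
        "cataloging_unit": code[6:8],   # fourth two digits
    }


parts = parse_huc("10300102")   # arbitrary example value
```

Note that the two-digit pieces are nested: a subregion number is only meaningful within its region, so the leading digits of the full code should be kept when aggregating.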
Users should be forewarned that the assignment of census blocks to hydrologic unit codes involves a certain amount of allocation. First of all, the geographies, albeit close in scale, come from different sources. Secondly, a point-in-polygon routine was performed assigning census blocks to hydrologic units (in reality the boundaries of both layers may overlap). Finally, the hydrologic units, unlike political boundaries that are specified with high precision, are very scale sensitive and may not match where you think they should, particularly in urban areas where drainage is artificially controlled.
For more information regarding the hydrologic unit products, consult the HUC Products Page of the USGS. Specifics of how we assigned certain problem areas are explained in the Geocorr 1990 usage notes page.
For Missouri watersheds the Center for Agricultural, Resource and Environmental Systems (CARES) has created base maps for the 8-digit Hydrological units of Missouri, which can be viewed from their web site.
County-to-county flows of commuters were analyzed with a hierarchical cluster algorithm. The results of the cluster analysis were used to identify commuting zones (CZs) or groups of counties with strong commuting ties. For 1990, 741 commuting zones were delineated for all U.S. counties and county equivalents. These commuting zones are intended for use as spatial proxies for local labor markets when researchers are not concerned with minimum population thresholds. Where necessary, the commuting zones were then aggregated into 394 labor market areas (LMAs) that met the Bureau of the Census' criterion of a 100,000 population minimum.
The 1990 commuting zone code's first three digits indicate labor market areas with digits four and five indicating commuting zone (sequential numbering starting with 00).
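The layout of the commuting zone code can be sketched in Python as follows. The function name split_cz_code is hypothetical and the example value is made up; the sketch only assumes a five-digit code laid out as described above.

```python
def split_cz_code(code):
    """Split a five-digit 1990 commuting zone code into (LMA, CZ sequence)."""
    value = str(code).zfill(5)   # keep leading zeroes
    if len(value) != 5 or not value.isdigit():
        raise ValueError(f"not a five-digit commuting zone code: {code!r}")
    lma = value[:3]       # digits one through three: labor market area
    cz_seq = value[3:]    # digits four and five: commuting zone, starting at 00
    return lma, cz_seq


lma, cz_seq = split_cz_code("12301")   # made-up example code
```

Grouping records on the first element of the result aggregates commuting zones up to their labor market areas.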
This research replicated a previous delineation of U.S. 1980 commuting zones and labor market areas and was performed at the Louisiana Population Data Center (under the guidance of Charles M. Tolbert and Molly Sizer Killian). For more information, please consult the references and publications listed at this site and/or at http://www.lapop.lsu.edu/ftp.html (see the line labeled Master county equivalency file), where you can download the master county to LMA/CZ equivalency file for the entire U.S.
Metro counties (Metro counties are not classified in the ERS county typology):
Source: Economic Research Service, USDA. The Beale Codes were also obtained from the Louisiana Population Data Center.
For additional background information on any of the geographic units maintained or utilized by the Census Bureau you can view the Geographic Areas Reference Manual, which is now available on the Geography Division's home page.
Other useful links: