The ZIP Code Resources Page

(AKA "All about ... ZIP Codes")

Tools and Resources Related to U.S. ZIP Codes

Version 2.3, November 2010

About This Document

This document is not entirely current. It was originally created circa 2002 as an update to the original version (created in the early 90s and referenced below) so that it dealt with ZIP codes and 2000 ZCTAs (i.e. the ZCTAs defined by the Census Bureau for tabulating the results of the 2000 census). In this document we point to quite a few web pages at the Census Bureau, which is (in)famous for moving things around and/or letting them disappear. So there are some broken links here. More importantly, there are links that work but still deal with ZCTAs as they were defined for the first decade of this century. While we do not have an entirely new replacement page, we do have a new page that deals solely with what is new and different for the decade 2010-2019. You can read this document for basic background info, but when you are looking for the latest data at the ZIP/ZCTA level, or accessing data sets that show how these codes currently relate to other geographies, you want to look at the page we describe next.

Supplemental Document Deals with What's New for 2010 and Beyond

A new document has been created to supplement this one with information regarding the current status of 2010 vintage ZIP codes and ZCTA's. It is probably the page you really want, and it can be accessed at: http://mcdc.missouri.edu/pub/allabout/zipcodes_2010supplement.shtml


This page serves as a kind of FAQ (but without the Q and A format) for researchers interested in U.S. postal geography codes. It provides some general background information about ZIP codes. It describes and links to various reference materials, data files, web sites and other resources of interest to users wanting to work with ZIPs. This is Version 2.3, a revision that features an updated description of the ZIP Code Master Dataset (at federalgovernmentzipcodes.us - not to be confused with our own ZCTA_Master data set). Version 2.0 was a major revision of the original page, which was created in the mid to late 90's and dealt primarily with the codes as they were defined for use in the 1990 census. This version focuses on the Census Bureau's 2000 vintage ZIP Census Tabulation Areas (ZCTAs), close cousins of ZIP codes. As in the earlier version, we place special emphasis on tools for linking ZIP (/ZCTA) codes to other geographies (such as counties, cities, metro areas) and to demographic information from the latest decennial census.



General Information About ZIP Codes

Problems With Spatial Definition

ZIP codes are a very messy kind of geography. They were created by the U.S. Postal Service as a tool to help deliver the mail more efficiently. ("ZIP" is actually an acronym for "Zone Improvement Plan", where "Zone" is a reference to the 2-digit postal zones that were used by the post office prior to implementing nationwide ZIP codes back in the early 1960's. Because it is an acronym we always use the uppercase for it.) ZIP codes have been adopted by marketing people and by all kinds of other researchers as a standard geographic area, like a city or a county. We see maps of ZIP codes in telephone books and from commercial vendors that make us think of them as spatially defined areas with precise boundaries, similar to counties. But, from the perspective of the agency that defines them, the U.S. Postal Service, ZIP codes are not and never have been such spatial entities. They are simply categories for grouping mailing addresses. As such, ZIP codes do in most cases resemble spatial areas since they are composed of spatially clustered street ranges. But not always. In rural areas, ZIP codes can be collections of lines (rural delivery routes) that in reality do not look much like a closed spatial area. In areas where there is no mail delivery (deserts, mountains, lakes, much of Nevada and Utah) ZIP codes are not really defined. You may see maps that show ZIP code boundaries that include such areas, but these are not post-office-defined official definitions. An area will not be assigned a ZIP code until there is a reason for it, i.e. until there needs to be mail delivered there. So the actual definition of a ZIP code "boundary" is quite fuzzy at best, and a purely extrapolated guess (at what it would be if someone were to start receiving mail there) at worst. If you have an application that requires extreme geographic precision, especially in sparsely populated areas, then you need to avoid using ZIP codes.

How ZIP Codes Change Over Time

An important thing to keep in mind about ZIP codes is that they change over time. In some cases these changes can be quite dramatic, but more commonly they are small and subtle. When a ZIP code changes its definition it does not change its name the way a census tract does. The ZIP code that was called '63301' in St. Charles County, Mo. in 1985 was subsequently broken into first two and then three ZIP codes. These new codes were not called 63301.01, 63301.02 and 63301.03; they were called 63301, 63303 and 63304. So what is referred to as 63301 today represents about a third of the area that it referred to in 1985. The new code 63303 did not exist in 1985 and it has already changed its definition so that it now represents about half of the area it included when it was initially created (by splitting 63301 into 63301 and 63303; a few years later the initial 63303 ZIP was subdivided into 63303 and 63304). What this means, of course, is that ZIP codes are really terrible units for doing any kind of time-series analysis unless you have some way of keeping track of all the changes over time. Otherwise, you may wind up concluding that there has been a dramatic downward trend in the population of 63301 since 1980, when in fact just the opposite is true. At least when you attempt a time-series study of 63304 it becomes apparent that this geographic entity did not exist before 1990.

What the world really needs to deal with ZIP code geography properly is a large geographic equivalency file relating ZIP codes to other relevant geographies with a time dimension. Instead, what we have is a pair of such equivalency files that relate ZIP codes to geographic entities as used for tabulating the 1990 and 2000 decennial censuses. The 1990 file uses ZIP codes as they were defined around July of 1991. (Because it takes a long time to do the research, it may be that the currency of the ZIP codes used varied somewhat from area to area.) For 2000 the Census Bureau decided to adopt the new things they called ZIP Census Tabulation Areas (ZCTAs), which were really very similar to what they called ZIP codes for tabulating earlier censuses. The equivalency files we are referring to here are the MABLE databases and corresponding Geocorr web applications, which we'll be talking about in more detail below. For now, what we want to emphasize is that when we talk about ZIP codes (or ZCTAs) we really need to keep a time reference in mind. Just as when you work with census tracts you need to know whether you mean 1980 or 1990 tracts, or when you are talking about the countries of Europe -- time is an important dimension.

You might think that what we should always assume is that if we do not specify a time, then we are probably referring to the most current definitions of ZIP codes, and that any reference materials should be periodically updated to reflect these definitions. Easier said than done, of course. This would be a huge task. But even if you could maintain all your lists with the latest definitions in some cases there are reasons why it may be preferable not to. This has to do with the fact that the Census Bureau tabulated the results of the last two decennial censuses to produce summary files describing these entities as they were defined at a certain point in time.  The data tables on these files describe the characteristics of the residential ZIP codes as they were (more or less) defined at the time the census  tabulations were prepared (circa July, 1991 for the 1990 census; Jan. 1, 2000 for the 2000 census.)  "More or less"? Yes - for the purposes of creating these tabulations in 1990 the Bureau had private vendors provide them with files that related each of the 1990 census blocks (the smallest geographic unit identified for each 1990 census return) with the then-current ZIP code definitions.  The files created by these vendors were used to create a data product called the "ZIP Block Equivalency Files" or "STF3B Headers Files". They define which geographic areas were used to approximate the ZIP code areas being summarized by the STF3B data tables. There is a built in "fuzz factor" in this equivalency list since - while the Census Bureau has created census blocks so that they do not cross any other census-defined geographic unit - blocks can and do (frequently) cross ZIP codes. Typically, ZIP code "boundaries" fall along back lot lines - they almost never split down the middle of a street. If they did, you would need to have two postal carriers - one from each of the two ZIP codes, travel the same street and deliver just to their side. 
Census blocks, however, almost always split down the middle of streets. As a result, blocks near the boundaries of ZIP codes typically straddle more than one ZIP code. If you picture the classic rectangular city census block, it is an area bounded by portions of 4 city streets. Each of those street faces has its own street name, address range and ZIP code, and it is quite common for the ZIP codes for the 4 streets to not all be the same. When this happens, the Census Bureau (or its vendor agents in 1990) assigns the entire block to a single ZIP/ZCTA as used for tabulating the census. So a census block might have 12 households with 40 persons living in ZIP A and 6 households with 15 persons living in ZIP B; for the sake of doing the census tabulation, all 18 households and 55 persons will be tabulated as part of ZIP A. We sometimes describe this phenomenon as "rounding off the ZIP data to blocks". These rounding errors are unbiased and may cancel each other out to some extent, but they are still an important source of potential error. In the block to ZIP equivalency files prepared by the vendors in 1990 and assigned by the Census Bureau in 2000, each census block was assigned to one and only one ZIP or ZCTA. The results of these assignments were used in creating the Master Area Block Level Equivalency ("MABLE") files used in the Geocorr web applications.
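The "rounding off to blocks" effect can be illustrated with a small sketch. The data layout (a dictionary of blocks, each holding per-ZIP household and person counts) is hypothetical, and choosing the ZIP with the most households is an assumption made for illustration; the text above says only that the whole block goes to a single ZIP/ZCTA.

```python
from collections import defaultdict

def tabulate_by_assigned_zip(blocks):
    """Assign each whole block to one ZIP, then sum counts per assigned ZIP."""
    totals = defaultdict(lambda: {"households": 0, "persons": 0})
    for counts_by_zip in blocks.values():
        # Assumed rule: the ZIP with the most households wins the whole block.
        winner = max(counts_by_zip, key=lambda z: counts_by_zip[z]["households"])
        for counts in counts_by_zip.values():
            totals[winner]["households"] += counts["households"]
            totals[winner]["persons"] += counts["persons"]
    return {zip_code: dict(t) for zip_code, t in totals.items()}

# The block from the text: 12 households / 40 persons in ZIP A,
# 6 households / 15 persons in ZIP B.
blocks = {
    "block-1": {
        "A": {"households": 12, "persons": 40},
        "B": {"households": 6, "persons": 15},
    }
}
result = tabulate_by_assigned_zip(blocks)
print(result)  # {'A': {'households': 18, 'persons': 55}}
```

ZIP B disappears from the tabulation entirely for this block, which is exactly the kind of error the paragraph above warns about.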
 


Most of the various files that we reference from this page will be dealing with ZIP/ZCTA codes as they were defined for the purposes of tabulating the 2000 census. In other words, the "default time" for ZIP codes for the sake of this document is approximately Jan. 1, 2000. In our previous version of this page, we focused our attention on the ZIP codes as defined for use in the 1990 census.

Failure to Conform to Other Geographic Schemes

Another important and exasperating characteristic of ZIP codes is that they do not conform to any other geographic schemes. Most geographic units are part of some hierarchical system, and frequently they will recognize other boundaries such as counties or states. But ZIP codes follow no rules whatsoever with respect to other geographies. ZIP codes can and do cross state lines (rarely, but just enough to cause some problems and confusion), county lines (about 10% of ZIPs are in more than one county), political jurisdictions (cities, congressional districts), metro areas, etc.

This aspect of ZIPs (specifically as defined for the 1990 census) and several other useful bits of information about them are discussed in the MAGGOT (Master Area Geographic Glossary of Terms) file provided with MABLE/Geocorr.


ZCTAs

ZCTA's (ZIP Census Tabulation Areas) are what the U.S. Census Bureau is now (as of 2000) using as an alternative to ZIP codes as geographic entities for publishing data based on actual ZIP codes. A 5-digit ZCTA (there are 3-digit ZCTAs as well) is typically nearly identical to a 5-digit U.S.P.S. ZIP code, but there are important distinctions. The Census Bureau has created a web site where they explain the differences (among other things.)   Here is an excerpt from that site that nicely summarizes the major things to keep in mind when working with ZCTA's and pretending that you are working with ZIP codes:
It is important to note the following:
  • In most instances the ZCTA code equals the ZIP Code for an area
  • In creating ZCTAs, the Census Bureau took the ZIP Code used by the majority of addresses in a area for the ZCTA code; some addresses will end up with a ZCTA code different from their ZIP Code.
  • Some ZIP Codes represent very few addresses (sometimes only one) and therefore will not appear in the ZCTA universe.
  • The term ZCTA was created to differentiate between this entity and true USPS ZIP Codes.
    Information on the Census Bureau's position regarding ZIP Code data.
  • ZCTA is a trademark of the U.S. Census Bureau; ZIP Code is a registered trademark of the U.S. Postal Service.
  • The Census Bureau does not have U.S. Postal Service ZIP Code boundary files, nor do we have information or possible sources of such files.
  • Census Bureau data sets tabulated by ZIP Code are listed on the ZIP Code Statistics page.
Note that ZCTA's are new for 2000; there are, strictly speaking, no historical ZCTA's for doing any time-series analysis. The Bureau  published results of the 2000 Census aggregated to these geographic units on Summary Files 1 and 3. Unlike the ZIP codes used for tabulating earlier censuses, these ZCTA areas are spatially complete and you can easily do mapping with them. You can download ZCTA boundary files from the Census Bureau's Cartographic Boundary Files web site.  

Special ZCTAs: XX and HH

The Bureau has created special XX ZCTA's (ZCTA's with a valid 3-digit ZIP but with "XX" as last 2 characters of the code - such as "631XX") which represent large unpopulated areas where it made no sense to assign a census block to an actual ZIP code. Similarly, HH ZCTA's  such as 633HH (the H stands for Hydrography, we assume) represent large bodies of water within or bordering a 3-digit ZIP area. There are typically no persons or households in an XX or HH ZCTA.    Applications that use ZCTA codes for population-based applications (as opposed to spatial based) can generally ignore these special ZCTAs. 
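For population-based work, the special codes can be screened out by looking at the last two characters. This is a minimal sketch; the function name is our own invention, and it assumes ZCTA codes are held as 5-character strings.

```python
def is_special_zcta(zcta5):
    """Return True for XX (unpopulated land) or HH (water) ZCTA codes."""
    code = zcta5.strip().upper()
    return len(code) == 5 and code[3:] in ("XX", "HH")

zctas = ["63301", "631XX", "633HH", "65201"]
populated = [z for z in zctas if not is_special_zcta(z)]
print(populated)  # ['63301', '65201']
```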

The MCDC's georef.zcta_master Data set

You can think of the zcta_master data set as a very large table with over 33,000 rows and 60 columns. Each row corresponds to one State/ZCTA combination. The columns describe the ZCTA's in several ways. To see a complete detailed listing and explanation of these columns (or "variables") see the metadata page for this data set. Most of the columns provide information about what geographic entities intersect with the ZIP (ZCTA). Here is a display of one row (observation) from the zcta_master data set, showing the entry for ZIP/ZCTA 65201 (basically, downtown Columbia, Mo).


This output is a pdf file generated using the Dexter web extraction program that involved only 3 simple parameter entries on our part. We specified that we wanted our output in pdf format; we specified a "filter" of the form:

zcta5     Equal To (=)    65201

and we specified that we wanted ZCTA5 to be used as the "ID variable" to appear as the first item in each row of the display.

What can you say about ZIP 65201 based upon these data? You can say that it is primarily in Boone County, MO (FIPS code 29019) and secondarily in Callaway County (FIPS code 29027). The pctcnty variable tells you that 99.8% of the people who lived in this ZCTA at the time of the 2000 census also lived in the primary county (Boone). We can also see that the ZIP is mostly (78.4%) in the city ("place" using the census terminology) of Columbia, which has a FIPS code 15670 (variable PlaceFP). The variables PlaceFP2 and pctplace2 tell us that the remainder of the ZIP (i.e. the part that is not within the city of Columbia) is in an unincorporated area.

Another series of codes identifies various metropolitan and urbanized areas associated with the ZIP. We see that 65201 has 89% of its population living in the Columbia, MO urbanized area and 99.8% in the Columbia MSA (both the old 2000 msacmsa version and the current core-based version, cbsa).

We have a number of spatial data items associated with the ZCTA; we have land area in square miles, the "internal point" coordinates for the ZCTA as published with the 2000 census data, and a custom coordinate pair obtained by taking the population-weighted averages of the internal pt coordinates of all the census blocks that were within the ZIP (stored as popcentrLat, popcentrLon). And finally we have a series of key demographic and economic status indicators taken from the 2000 census tables: total population and housing units, median household income, mean poverty ratio and average housing value. Each of the last 3 have the actual value as well as a corresponding index value that tells you how it compares to the value for the entire country. We see that 65201 has (had) a Median Household Income value of $26,955 and that this was only 64.2% of the value for the U.S. as a whole. This reflects the fact that this area, which includes at least portions of 3 university campuses, is heavily inhabited by college students - who typically have lower incomes and smaller households. Note that the other 2 economic measures also indicate an area that is below average in terms of economic well being, but that those measures have indices in the low 80's which would indicate the area is not really a slum.
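The population-weighted coordinate pair described above can be sketched as follows. The block coordinates and populations are made up for illustration, and the function is our reading of "population-weighted averages of the internal point coordinates", not the code actually used to build zcta_master.

```python
def population_weighted_centroid(blocks):
    """Average block internal points, weighting each block by its population.

    `blocks` is a list of (latitude, longitude, population) tuples.
    Returns None when the total population is zero.
    """
    total_pop = sum(pop for _, _, pop in blocks)
    if total_pop == 0:
        return None
    lat = sum(lat * pop for lat, _, pop in blocks) / total_pop
    lon = sum(lon * pop for _, lon, pop in blocks) / total_pop
    return lat, lon

# Hypothetical blocks: the unpopulated third block has no pull on the result.
blocks = [
    (38.95, -92.33, 300),
    (38.97, -92.31, 100),
    (38.90, -92.20, 0),
]
centroid = population_weighted_centroid(blocks)
print(centroid)
```

Unlike a purely geometric internal point, this point moves toward wherever the residents actually are, which is usually what a distance-to-customers calculation wants.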

There is even an appsurl variable that contains a ZIP-specific web address that can be followed in order to obtain additional data regarding the ZIP.

Accessing the ZCTA_Master Data set

Go to http://mcdc.missouri.edu/applications/uexplore.shtml - the Uexplore home page/directory. Choose Geography/GIS as the major data category, and then choose georef as the filetype (major data directory). This takes you to uexplore referencing our geographic reference data sets collection - http://mcdc.missouri.edu/cgi-bin/uexplore?/pub/data/georef . Click on the Datasets.html file to get a directory page with metadata and links to the subcollection. The zcta_master data set is the second one listed (this could easily change in the future -- just scan the first column for the name). Click on its name to select it and to invoke Dexter. Follow the link at the top of the Dexter query form to "detailed metadata" (from that page there is another link, near the bottom, to a separate page where we go into more detail about each variable - http://mcdc.missouri.edu/pub/data/georef/zcta_master.Metadata.html ). If you are new to Dexter you can follow the link to the Dexter Quick Start Guide at the very top of the query form. That will take you through the basics of how Dexter works. But if you just want to grab all rows and all variables in csv format, just go to Section III. Choose Variables and click the box that says you want to keep all the variables. Then find one of the "Extract Data" buttons and click it. The query takes a half minute or so to generate a csv file that is quite large - 15 meg. We have stored a permanent copy of this csv file in the georef directory and you can access it directly at http://mcdc.missouri.edu/pub/data/georef/zcta_master.csv.
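Once you have the csv file, pulling out the row(s) for one ZCTA takes only a few lines. The sketch below is hedged: it assumes the file has a header row with a zcta5 column (capitalization is normalized because we have not checked it), and the inline sample stands in for the real file. The "county" column name and the 65202 row are made up.

```python
import csv
import io

def find_zcta_rows(csv_file, zcta5):
    """Return all rows (as dicts with lowercase keys) matching a ZCTA code."""
    matches = []
    for row in csv.DictReader(csv_file):
        # Lowercase the header names; the real file's capitalization is assumed.
        lowered = {key.lower(): value for key, value in row.items() if key}
        if (lowered.get("zcta5") or "").strip() == zcta5:
            matches.append(lowered)
    return matches

# Tiny stand-in for zcta_master.csv (hypothetical layout).
sample = io.StringIO(
    "zcta5,county,pctcnty\n"
    "65201,29019,99.8\n"
    "65202,29019,100.0\n"
)
rows = find_zcta_rows(sample, "65201")
print(rows[0]["pctcnty"])  # 99.8
```

For the real data, pass an open file instead of the sample, for example `open("zcta_master.csv", newline="")` after downloading a copy.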

Note: if you have mastered the fine points of the zcta_master file you can probably skip the next topics, since you can easily derive the same information using zcta_master.

Relating ZCTAs to Counties

One of the most frequently asked questions regarding ZIP codes (ZCTAs) is "If I know the ZIP code of my {customer/survey respondent/perpetrator} how can I tell what state and county they live in?" For these kinds of questions, we build things called "equivalency files" or "correlation lists" that show the relationship between two sets of geographic codes. Using the geographic header data provided on the block-level records on the 2000 Census Summary File 1 series, we were able to build such an equivalency file that related all combinations of state-5-digit ZCTAs to counties in the U.S. (50 states + DC). We (the author, working under contract with the Missouri Census Data Center) stored the results in a tabular file in our data archive.  This file can be rather easily accessed via the web using our uexplore/Dexter interface.   You can browse the complete collection of our correlation list data files at  http://mcdc.missouri.edu/cgi-bin/uexplore?/pub/data/corrlst . From this page you can access the ZCTA-to-county correlation list data set by clicking on the entry us_stzcta5_county.sas7bdat (or, of course, you can take the shortcut to the Dexter application to access this data set by simply clicking on the data set name/link we just provided).

From here, just select the output format(s) of interest from Section I.  Choose HTML in addition to the default CSV format.  In  section II. Choose rows (observations) you can tell the application which rows you are interested in.  The rows in this data set correspond to ZCTAs crossed with counties.   Some typical filters that you might want to apply are:

There is no need to enter any filters -- if you want the entire data set.  There is also a box where you can enter a number that will limit the number of observations/rows on output.   To just see what the data looks like you might want to enter 100 in this box to see the first 100 rows of the table (any filters will  be applied first).  

Now scroll down the page to section III. Choose columns (variables). We suggest you click on the box indicating that you want to keep ALL the columns. Now go ahead and click on the Extract Data button to run the query. It should take a few seconds for your results to be displayed in the form of an Output Menu page that lets you view your multiple-output-file results. Click on the HTML Report link to see your output in html format, or on the Delimited File link to see your delimited file (probably in Excel if you are using the MS IE browser and have Excel on your PC).

Assuming you entered the Delaware filter (State  Equals  10) in  section II, your first output row should contain the following variables and values:
Clearly, what you have here is a tool for assigning county codes to your customer file using ZIP/ZCTAs as the link.   An analysis we did using  1990 census data indicated that if you assigned the primary county code to a record based on the ZIP code you would be right over 98% of the time.  
(See below for a discussion of an alternative resource that can help you find counties for ZIP codes that are not also ZCTAs.)
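Here is a rough sketch of how such a correlation list might be used to pick the primary county for each ZCTA. The column names (zcta5, county, pop100) are placeholders, not the actual variable names in us_stzcta5_county. The sample rows reuse Delaware population counts from the listing in the Geographic Header Files section of this page.

```python
import csv
import io

def primary_county_by_zcta(corr_file):
    """Map each ZCTA to the county holding the largest share of its population."""
    best = {}  # zcta5 -> (population, county)
    for row in csv.DictReader(corr_file):
        pop = int(row["pop100"])
        zcta = row["zcta5"]
        if zcta not in best or pop > best[zcta][0]:
            best[zcta] = (pop, row["county"])
    return {zcta: county for zcta, (pop, county) in best.items()}

# ZCTAs 19963 and 19977 each fall in two Delaware counties.
corr = io.StringIO(
    "zcta5,county,pop100\n"
    "19963,10001,6512\n"
    "19963,10005,8720\n"
    "19977,10001,10400\n"
    "19977,10003,3123\n"
)
primary = primary_county_by_zcta(corr)
print(primary)  # {'19963': '10005', '19977': '10001'}
```

Records whose ZCTA spans counties will sometimes be coded to the wrong one; keeping the population share alongside the county code lets you flag the doubtful cases.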

 

Geographic Header Files with ZCTAs

The Missouri Census Data Center has created a directory on their public census data server which has a complete set of geographic header data as distributed with the Summary File 1 data from the 2000 census. These header records have information about the geographic entities summarized on the SF1 data files. There are a great many such entities, and they range from state and county level records all the way down to 2000 census blocks. There are over 9 million of the latter entities nationwide. There are two reasons that people interested in ZIP/ZCTA geography may be interested in this collection.
  1. There are records (observations) on these data sets that contain data specifically related to the ZCTA-within-state and ZCTA-within-county summary levels (SumLev codes 871 and 881, resp.) These observations contain the 2000 population and housing unit counts from the census as well as the land and water area, the internal point latitude and longitude coordinates, etc.
  2. At the census block level (SumLev=101) all the other geocodes are reported (such as census tract, place, county subdivision, etc.) along with the ZCTA code. (Actually there are two fields/variables called ZCTA on these files/data sets: ZCTA3 and ZCTA5 -- when we reference ZCTA here we are referring to the full 5-digit version; 3-digit ZCTA's are just 3-digit ZIP code areas.)
You can access this collection of geographic reference files at: http://mcdc.missouri.edu/cgi-bin/uexplore?/pub/data/sf12000/xxgeos . (Right click and choose Open Link in New Window to see how this works without losing this page.) There are two files per state, with the block level records stored separate from all the other geographic levels. You will probably find the xxgeos files more useful, since they contain summaries for ZCTAs. The block level files are useful for relating ZCTA geography to other levels, but for this kind of analysis you will probably want to use the MABLE/Geocorr2k application (see below), which makes use of a database that was built largely from these header files.

If you followed the link above to explore the xxgeos data directory, you can now click on the file degeos.sas7bdat . To extract the ZCTA-related data you have to specify that you just want rows where the geographic summary level code (SumLev) is either 871 (ZCTA within state) or 881 (ZCTA within county). This means that in section II. Choose rows you should select SumLev as the value of Variable/Column, In List as the value of Operator, and then type in the value list 871:881 in the text entry box in the Value column. In III. Choose columns you should select the ID variables SumLev, State, County and ZCTA5 and all the numerics. If you chose html as one of your output formats you will generate a report that looks like this:

Listing of Extracted Data

Obs  SumLev  State  County  ZCTA5   AreaLand   AreaWatr  Pop100  HU100   IntPtLat    IntPtLon  LandSQMI  AreaSQMI
  1     871     10          19701   72747809     988990   31699  11757  39.598203  -75.699452     28.09     28.47
  2     871     10          19702   71855667     454665   44836  17117  39.626297  -75.713864     27.74     27.92
  3     871     10          19703   10075188          0   15312   7070  39.800945  -75.464550      3.89      3.89
  4     871     10          19706    6615012          0    1623    629  39.573744  -75.592043      2.55      2.55
  5     871     10          19707   31559283      53145   15209   5407  39.784014  -75.685864     12.19     12.21
  6     871     10          19709  201315067    3173333   19583   6808  39.479602  -75.693201     77.73     78.95
  7     871     10          19710     406732          0      38     13  39.788562  -75.588815      0.16      0.16
  8     871     10          19711   73325522          0   55760  19375  39.700561  -75.743103     28.31     28.31
  9     871     10          19713   34115597          0   31259  12577  39.669211  -75.717961     13.17     13.17
 10     871     10          19720   99213137     174172   55539  21488  39.669219  -75.590030     38.31     38.37
 11     871     10          19730    1215263          0     316    142  39.456484  -75.659768      0.47      0.47
 12     871     10          19731     369987       4142     104     47  39.518164  -75.576564      0.14      0.14
 13     871     10          19732      74349          0      54     34  39.794496  -75.574338      0.03      0.03
 14     871     10          19733     159378          0     124     52  39.555794  -75.650585      0.06      0.06
 15     871     10          19734  213869669     892861    5858   2185  39.386601  -75.668010     82.58     82.92
 16     871     10          19736    2225469       2036      63     26  39.790911  -75.649341      0.86      0.86
.....rows omitted here ....
127     881     10   10001  19963  136695730    1114775    6512   2856  38.940868  -75.427962     52.78     53.21
128     881     10   10005  19963  126584164    1561166    8720   3858  38.902285  -75.409684     48.87     49.48
129     881     10   10001  19964   23170474     151343    1117    423  39.098772  -75.739431      8.95      9.00
130     881     10   10005  19966  228509679     991118   17768  11539  38.601355  -75.241103     88.23     88.61
131     881     10   10005  19967    2355508          0     462    258  38.545970  -75.111753      0.91      0.91
132     881     10   10005  19968  145702038    2032964    6552   3391  38.772648  -75.286658     56.26     57.04
133     881     10   10005  19970   25753027       2578    4481   3885  38.550440  -75.099282      9.94      9.94
134     881     10   10005  19971   53613166     288759   10085  12114  38.711512  -75.096772     20.70     20.81
135     881     10   10005  19973  201080075    1073931   21416   8759  38.643248  -75.611025     77.64     78.05
136     881     10   10005  19975   77474565    1709540    6408   4788  38.463751  -75.156423     29.91     30.57
137     881     10   10001  19977  172889529    6249294   10400   4158  39.290689  -75.588323     66.75     69.17
138     881     10   10003  19977   46264558      67923    3123    578  39.332977  -75.628061     17.86     17.89
139     881     10   10001  19979    6023487          0     592    232  39.046100  -75.571851      2.33      2.33
140     881     10   10001  19980     138082          0      75     22  39.070270  -75.570575      0.05      0.05
141     881     10   10001  199HH          0  529249574       0      0  39.069866  -75.413550      0.00    204.34
142     881     10   10005  199HH          0  654261354       0      0  38.655425  -75.174584      0.00    252.61
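If you work with the full header file outside of Dexter, the same selection can be sketched in a few lines. This assumes the rows have already been read into dictionaries keyed by the column names in the listing, and that AreaLand and AreaWatr are in square meters, which is consistent with the LandSQMI and AreaSQMI values shown. The block-level row in the sample is made up.

```python
SQ_METERS_PER_SQ_MILE = 1609.344 ** 2  # one international mile is 1609.344 m

def square_miles(area_sq_meters):
    """Convert square meters to square miles, rounded to two decimals."""
    return round(area_sq_meters / SQ_METERS_PER_SQ_MILE, 2)

def zcta_summaries(rows):
    """Keep ZCTA summary rows (SumLev 871 or 881) and add areas in square miles."""
    out = []
    for row in rows:
        if row["SumLev"] not in ("871", "881"):
            continue
        land, water = int(row["AreaLand"]), int(row["AreaWatr"])
        out.append({
            "ZCTA5": row["ZCTA5"],
            "LandSQMI": square_miles(land),
            "AreaSQMI": square_miles(land + water),
        })
    return out

rows = [
    {"SumLev": "871", "ZCTA5": "19701", "AreaLand": "72747809", "AreaWatr": "988990"},
    {"SumLev": "881", "ZCTA5": "199HH", "AreaLand": "0", "AreaWatr": "529249574"},
    {"SumLev": "101", "ZCTA5": "19701", "AreaLand": "1000", "AreaWatr": "0"},  # made-up block row
]
summaries = zcta_summaries(rows)
print(summaries)
```

The two ZCTA rows reproduce the 28.09 / 28.47 and 0.00 / 204.34 square-mile figures from observations 1 and 141 of the listing.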

 

ZIP Code Master Dataset and State ZIP Code Directories

This is an excellent resource if you are looking for basic geographic information about ZCTA's (2000 vintage for now) and ZIP codes. The emphasis is on names associated with the codes (both "preferred" and alternate) and location (city, state, county). Latitude, longitude coordinates are also provided. Access the data at http://federalgovernmentzipcodes.us/, where you can download it in csv, Excel or mySQL format. The data were revised using USPS updates through 11-15-10. The Contact link on this page still does not work but this one does. (Note that while the name seems to indicate otherwise, this is not a federal government web site.)

The files have the following key features:

Relating ZIP Codes to Counties

Are we repeating ourselves? Didn't we already deal with this above in the section on ZCTA's? No, in that section we talked about relating ZCTA's to counties, not ZIP codes. As already noted, there are important differences. ZCTA's get old (frozen at the time of the latest census) and do not include proxies for non-residential ZIP codes such as P.O. Box-only ZIPs and "unique" ZIPs assigned to large companies or other organizations. These caveats can easily result in being unable to link 10% or so of your ZIP code file when using the tools described above for ZCTA's. We now have an alternative source that can help us get a more complete list. That source is the ZIP codes master file described in the previous section. It contains data for real ZIP codes, not ZCTA's, and it contains a field identifying the county in which the ZIP is (all or mostly) located. Unfortunately, it only contains the name and not the FIPS code for the county. So it's not the perfect solution. But it should be good enough for a lot of applications.
To use the master ZIP codes file on a one-at-a-time manual basis you can simply generate one of the directories we pointed to above and do a manual lookup of each ZIP. To automate the process you will need to generate a file containing the ZIP code and the county. The tool for doing that is once again our friend Dexter. We'll spare you the mini-tutorial and simply point you to a pre-defined query that you can access to generate the extract. Go to the pre-defined query file. You should be viewing a Dexter query form page that has been pre-coded to access the zipcodes dataset in the georef data directory. The query as defined requests output in the form of a csv file (no report file, no database file); no filtering (you get the entire country); and relevant variables selected (you can ask for more or fewer by simply modifying the select lists in section III of the form). You can, if you want and you know how, modify the query any way you want. But all you have to do is click on one of the Extract Data buttons. Then when your blue output menu page is displayed you'll need to click on the link(s) to your output file(s). The only hard part will be figuring out how you will use the resulting "lookup table" file to do the actual encoding.
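As one possible approach to that "hard part", here is a minimal sketch of applying a ZIP-to-county lookup table to a file of records. The record layout and function names are hypothetical. The lookup is shown as a plain dictionary, which you would build from the extracted csv file. ZIPs are cleaned first because spreadsheets often drop leading zeros and mailing lists often carry ZIP+4 codes.

```python
def normalize_zip(raw):
    """Reduce a ZIP or hyphenated ZIP+4 string to a zero-padded 5-digit code."""
    digits = raw.strip().split("-")[0]
    if not digits.isdigit() or len(digits) > 5:
        return None
    return digits.zfill(5)

def encode_counties(records, zip_to_county):
    """Split records into those coded with a county name and those unmatched."""
    coded, unmatched = [], []
    for rec in records:
        county = zip_to_county.get(normalize_zip(rec["zip"]))
        if county is None:
            unmatched.append(rec)
        else:
            coded.append({**rec, "county": county})
    return coded, unmatched

# Lookup entries taken from examples elsewhere on this page.
zip_to_county = {"65201": "Boone", "63301": "St. Charles"}
records = [
    {"id": 1, "zip": "65201-1234"},
    {"id": 2, "zip": "63301"},
    {"id": 3, "zip": "00000"},
]
coded, unmatched = encode_counties(records, zip_to_county)
print(len(coded), len(unmatched))  # 2 1
```

Keeping the unmatched records in their own list makes it easy to see how much of your file failed to link, and to look those ZIPs up by hand.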

 

MABLE/Geocorr: Generating ZCTA Geographic Equivalencies

The MABLE/Geocorr web application (at  http://mcdc.missouri.edu/websas/geocorr2k.html)  is an updated version of the original application (at http://www.oseda.missouri.edu/plue/geocorr/htmls/geocorr3.html ) .  Both applications do essentially the same thing, but the newer version uses census 2000 geography and later, while the earlier version used 1990 geography.  The new version also does what it does about 7 or 8 times faster than the old version because it runs on a much faster server.   What it does is allow you to dynamically generate files and reports that show how various geographic layers are related to one another.   For example, you can choose one or more states as your geographic universe of interest and then ask the program to show you how ZCTAs within those states relate to just about any other geographic layer you can think of.  Want to see how ZCTAs relate to counties?  Easy.  How about census tracts, school districts,  108th congressional districts or the latest urbanized areas?   All easy.  

To see how easy it can be, follow the link above to the geocorr2k application.   Choose your state from the first select list.  Then select 5-digit ZCTA (ZIP Census Tab. Area 2000) from the SOURCE Geocode(s) select list (on the left),  and Congressional District 108th (2002) from the TARGET Geocode(s) select list (on the right).   In the Output Options section, under Listing File, select html as the value for the format of this file (overriding default of plain text).    In the text box for Title type a title for the report such as "ZIP to Congressional District Equivalencies Using MABLE/Geocorr2k" .  Then click  the Run Request button. If you chose Maryland as your state it will take about 3 seconds to process about 60,000 census blocks (we also clicked the option box that said to Ignore Census Blocks with a value of 0 for the weighting variable or we would have processed more).   Clicking on the link to your Listing (report format)  file will let you see a report that looks just like this one that we generated for Maryland.  

Could I do the same thing I just did for one state for the entire U.S.? Well, yes, but it will take more time and resources. There are about 9 million block records in the MABLE2k database and they all have to be accessed to generate such a report/file. The web application has a built-in time limit which is easy to exceed when doing national queries. So you might have to break it down and do about 10 or 20 states at a time. We tried it with this particular request (it is much faster to do congressional districts than it would be to do something smaller such as census tracts or places, and we did it on a Saturday morning when the server was not very busy). Because we were now doing multiple states we changed our Target geocodes selection to include State as well as cd108 (we chose it as a target rather than a Source geocode so that our output would be sorted by ZCTA rather than state; obviously, you might want to go with the other option). Our attempt succeeded in producing a nationwide ZCTA to CD equivalency file in html format. However, the file was so large (about 32 megabytes and over 50,000 lines) that when we tried to load it with IE it hung the browser. We had better luck with Netscape, but after several minutes it was still loading, though it was letting us view the top lines while we waited. The moral here may be that just because you can sometimes do it doesn't mean you should. It might be better to break it down into smaller parts and/or request csv (comma-delimited ASCII) files rather than html or pdf reports. If you want to see what we got and test your browser's ability to handle oversized files you can attempt to browse that national file here.
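If you do go the csv route, a few lines of code can turn an equivalency file into a simple one-to-one assignment. The Python sketch below picks, for each ZCTA, the congressional district that receives the largest share of it. The column names zcta5, cd108 and afact (an allocation factor giving the portion of the source area that falls in the target area) and the sample values are assumptions made for illustration; look at the header of the file you actually download and substitute the real names.

```python
import csv
import io


def dominant_target(rows, source="zcta5", target="cd108", weight="afact"):
    """For each source code, return (target code, share) for the largest share.

    The default column names are assumptions, not a documented file layout.
    """
    best = {}
    for row in rows:
        src = row[source]
        share = float(row[weight])
        # Keep the target with the biggest allocation factor seen so far.
        if src not in best or share > best[src][1]:
            best[src] = (row[target], share)
    return best


# Invented sample rows: one ZCTA split across two districts, one entirely in a third.
sample_csv = io.StringIO(
    "zcta5,cd108,afact\n"
    "21000,01,0.25\n"
    "21000,02,0.75\n"
    "21001,03,1.0\n"
)
assignment = dominant_target(csv.DictReader(sample_csv))
for zcta, (district, share) in sorted(assignment.items()):
    print(zcta, district, share)
```

Whether "largest share wins" is the right rule depends on your application. For some purposes you will want to keep every row and use the allocation factors as weights instead.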

In the previous version of this resources page we had a section that provided links to a series of national ZIP equivalency files. We have decided not to try to reproduce those files for 2000 because we feel that with MABLE/Geocorr2k it is easy enough to create such files on the fly and get exactly what you need.


Relating ZIPs & ZIP Suffixes to County & Metro/Micropolitan Statistical Areas

We happened to stumble upon a curious resource on a Census Bureau web page dealing with definitions of Metropolitan and Micropolitan Statistical Areas. It was under a section titled Geographic relationship files, with a downloadable file titled 2007 ZIP code to 2006 CBSA. To make a long story short, we downloaded and converted the file. The result looks something like this:

[Sample listing from the converted file; not reproduced here.]

Perhaps not the best example, since this ZIP code is entirely within a single CBSA. But it does cross county boundaries, being on the border of St. Louis City (a county equivalent independent of St. Louis County). What makes this data resource special is its use of sub-ZIP code geography at the two- and four-digit ZIP-suffix levels. The level of suffix detail used varies by ZIP code. All of this information is stored in the MCDC public archive, where it can be accessed via the Dexter extraction utility as data set zip07_cbsa06 in the corrlst data directory. Be sure to take advantage of the link to "Detailed metadata" at the top of the Dexter query form.

You can use Dexter to print out directories to be used in manual assignments, or you can download the data in csv or SAS data set format and do your own automated processing.
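As a rough idea of what such automated processing could look like, the Python sketch below looks up a ZIP+4 value by trying the most specific entry first: the full four-digit suffix, then the first two digits of the suffix, then the bare 5-digit ZIP. Everything about the table layout here is hypothetical (the (zip, suffix) keys, the placeholder area labels and the function name were invented for the example); consult the detailed metadata for zip07_cbsa06 to see how the real file is organized before adapting it.

```python
def lookup_area(zip_plus4, table):
    """Return the most specific table entry for a ZIP or ZIP+4 string, or None.

    `table` is a hypothetical dict keyed by (zip5, suffix_prefix), where
    suffix_prefix is "" (whole ZIP), a 2-digit prefix or a full 4-digit suffix.
    """
    digits = "".join(ch for ch in zip_plus4 if ch.isdigit())
    zip5, suffix = digits[:5], digits[5:9]
    # Try the longest suffix match first, then fall back to coarser levels.
    for width in (4, 2, 0):
        if len(suffix) < width:
            continue
        key = (zip5, suffix[:width])
        if key in table:
            return table[key]
    return None


# Placeholder entries for one imaginary ZIP code split at different suffix levels.
table = {
    ("12345", ""): "county A",
    ("12345", "67"): "county B",
    ("12345", "6789"): "county C",
}

print(lookup_area("12345-6789", table))  # matched on the 4-digit suffix
print(lookup_area("12345-6700", table))  # matched on the 2-digit suffix
print(lookup_area("12345", table))       # matched on the ZIP alone
```

The fallback order matters: checking the coarse levels first would hide the finer suffix-level entries that make this file worth using.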


Demographic Data for 2000 ZCTAs

Tabular Data on 2000 Summary Files

The Census Bureau created detailed demographic summaries for ZCTAs (both complete and within county) as part of their 2000 Summary File 1 and Summary File 3 data products. These are very large collections of detailed tables that you might have occasion to use if you have a specific item of interest that requires more detail than most users will ever want. There are, for example, over 16,000 cells of tabular data for every ZCTA on Summary File 3. You're probably going to want this boiled down to something more readily accessible.

Fortunately, you will probably never have to get involved directly with the Summary Files. Both the Census Bureau and the Missouri Census Data Center (MCDC - where the author works) have created demographic profile products which take these thousands of table cells and boil them down to a few hundred key data items, which are then presented in easy-to-read reports. You can view these data one ZCTA at a time in your browser, or you can access data files that have the boiled-down data available for all ZCTAs in formats that can be readily loaded into a spreadsheet or database access package (e.g. Excel or Access). Doing this is going to require that you become familiar with either the American FactFinder access tool (to access the Census Bureau profiles, aka "Quick Tables") or the uexplore/dexter software (to access the MCDC's data sets). Neither tool is terribly difficult to use, but it does mean you have to invest a little time before you can access that first set of data.

The MCDC's Demographic Profile Reports and Data sets

There are actually two sets of demographic profiles at the ZIP/ZCTA level based on the 2000 Census from the Missouri Census Data Center. But the collection based on Summary File 1 (which means complete count, short form data) is very limited in demographic detail and will rarely be what you want. When you want the "good stuff" - data on income, poverty, educational attainment, etc. - you want Summary File 3 (which means sample data based on the long form) and this translates into what the MCDC calls the dp3_2k (Demographic Profile 3, 2K Census) product. You can access these products using various entry points. If all you want to do is go straight to the main menu for the ZCTA profiles then you want the U.S. ZCTAs main menu page. If, however, you want to start at the beginning you can go to the Main menu page for the dp3_2k application.

You follow the menu pages for the dp3_2k application until you finally click on a link to the 5-digit ZCTA itself, and in a second or two (it takes a little while because the report is dynamically generated from a database) you will be presented with a longish 2-column report in html format. The profile is divided into 29 sections (topics). The header line for each of these sections is a hyperlink to the metadata for that topic. There are many hyperlinks on the report page. An important one to follow the first time you use the application is to the Usage Notes page (the link is at the top of the report, just below the name of the area being summarized). The Usage Notes page contains a lot of background information as well as tips on how to use and interpret the data. Among the things you will learn if you read the Usage Notes page carefully:

Access "Quick Tables" Data Via American FactFinder

You can also access 2000 census data at the ZCTA level via the Census Bureau's American FactFinder web application. Not just data, but reference and thematic maps as well. Select the "Data Sets" option at the top left of the page, and select "Census 2000 Summary File 3" as the data set. Then you should be able to choose your geography, including 3- or 5-digit ZCTAs. The Census Bureau calls their profile data sets "Quick Tables", so you should select this as your Tables option when you want to get just the most frequently used data. But you can also access the complete Detailed Tables if you would like. For many geographic units (states, counties, cities, congressional districts, for example) the Bureau makes these data available in the form of 3-page pdf files/reports. Unfortunately, these report format files are not available for ZCTA geography. See the MCDC's overview page for these products.

Looking forward: ZCTA Data from the American Community Survey

The news here is not so good. What most people and organizations (with the notable exception of the private data companies who use current ZIP code estimates as reliable cash cows) would love to have is reliable and more current demographic/economic data from the Census Bureau summarized for current ZIP codes (or ZCTA proxies thereof). Unfortunately, the Bureau's new American Community Survey delivers neither the updated ZIP-related (i.e. ZCTA) geography nor much in the way of updated data for the existing ZCTAs. The ZCTA geography remains frozen in time as of 2000 -- there are no "2007 ZCTAs" (for example). The concept makes sense, but the logistics require the Bureau to have complete block level data and this they do not have. What they do have, of course, is the old 2000 vintage ZCTAs, and data could be published for these areas. But the ACS has a population threshold of 65,000 before a geographic area will have summary data published based on a single year of ACS survey data. Starting in 2008, data were released for areas as small as 20,000 population, based on the most recent 3 years of survey data combined (i.e. years 2005-2007). While there will be a few ZCTAs that qualify for such data, the Bureau decided that it would not deliver any ZCTA level data until 2010, when we can expect to see 5-year period estimates based upon combining surveys for the years 2005-2009. By 2011 we'll have new block level data from the 2010 census and presumably we'll have new 2010 census data at the "ZCTA 2010" level. But unfortunately, the 2010 census data will be short-form only (meaning just basic demographic counts mostly related to age, race, Hispanic origin and household composition) and it is not yet clear what will happen to 2010 ZCTA summaries in ACS (will we have to wait another 5 years before we get any data for these "new" geographic entities?).

So when will we be able to get a relatively current source of long-form (e.g. income, poverty, education, house values, etc) data for somewhat current ZIP codes again? The over-simplified answer, though not completely certain, appears to be "never". You will be able to get 5-year data for 5-or-more-year-old ZCTA geography some time after 2010 that will come with rather large standard errors for the smaller ZCTAs; but the days of having data like we got from SF3 in 2002 are apparently gone. That's the bad news. The "good" news is that the not-nearly-as-good ACS 5-year data will be available every year once it starts flowing. It won't be much good for those wanting to find out what changed in the latest year, but at least it will be "new" data. It will probably do a better job of describing the true current state of a ZIP code than the 8-year-old data that we have now from the 2000 census.
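To make the population thresholds mentioned above concrete, here is a small Python sketch that reports which ACS period estimates an area of a given size would qualify for under the general rules: 65,000 or more for single-year data, 20,000 or more for 3-year data, and 5-year data for areas of any size. The function name is ours, and treating the 5-year estimates as having no population floor is an assumption of this sketch. Remember also that these are the general thresholds only; as noted, the Bureau chose not to publish any ZCTA-level ACS data until the 5-year estimates, so passing the test here does not mean the data exist for a ZCTA.

```python
def acs_estimates_available(population):
    """List the ACS period estimates an area of this population would qualify for.

    Thresholds are the general ones described in the text; they say nothing
    about whether the Bureau actually publishes data for a given geography type.
    """
    periods = []
    if population >= 65000:
        periods.append("1-year")
    if population >= 20000:
        periods.append("3-year")
    # Assumed here: 5-year estimates carry no population threshold.
    periods.append("5-year")
    return periods


for pop in (70000, 25000, 3000):
    print(pop, acs_estimates_available(pop))
```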

 

Links to Other ZIP-related Resources



John Blodgett     blodgettj@missouri.edu
Office of Social and Economic Data Analysis (OSEDA)
626 Clark Hall   / University of Missouri   / Columbia, MO 65211

John Blodgett's Home Page   |   Missouri Census Data Center   |   OSEDA

URL: http://mcdc.missouri.edu/webrepts/geography/ZIP.resources.html