This page is a guide for researchers interested in U.S. postal geography codes (ZIP codes) and their relationship to the Census Bureau's ZIP Code Tabulation Areas (ZCTAs).
The U.S. Postal Service (USPS) created these Zone Improvement Plan codes to facilitate more efficient mail delivery beginning in the 1960s. Since then, ZIP codes have been widely adopted as a standard geographic area for other purposes, such as marketing or research.
ZIP codes are a very messy kind of geography. Maps of ZIP codes suggest they are spatially defined areas with precise boundaries, similar to counties. But, from the perspective of the USPS, ZIP codes are not and never have been spatial entities. They are simply categories for grouping mailing addresses.
ZIP codes do in most cases resemble spatial areas, since they comprise spatially clustered street ranges — but not always. In rural areas, ZIP codes may be collections of lines (rural delivery routes) that don't look much like a spatial area. In areas where there is no mail delivery (deserts, mountains, lakes, large areas in Nevada and Utah) ZIP codes are undefined. An area is not assigned a ZIP code until there needs to be mail delivered there. So the actual definition of a ZIP code "boundary" is quite fuzzy at best.
In short, ZIP codes are based on points, not areas, and are therefore unsuitable for mapping and many analysis applications.
Moreover, ZIP codes don't conform to any other geographic schemes. Most geographic units are part of some hierarchical system (such as the Bureau's own nation → state → county → tract → block group → block hierarchy). ZIP codes may cross state lines (rarely, but just enough to cause some problems and confusion), county lines (about 10% of ZIPs are in more than one county), political jurisdictions, metro areas, and so on.
ZIP codes change over time, occasionally quite dramatically but usually in small and subtle ways. When the definition of a ZIP code changes, its name does not change (as a census tract's typically would).
For example, the 63301 ZIP code (in St. Charles County, Mo.) was broken in 1985 into first two and then three ZIP codes (63301, 63303, 63304). So ZIP 63301 today represents about a third of the area of ZIP 63301 in 1985. One of the new codes, 63303, subsequently changed so that it now represents about half of the area it initially included.
This means that ZIP codes are poorly suited for any kind of time-series analysis, unless you have some way of keeping track of all the changes over time. Using the example above, you might wrongly conclude that there has been a dramatic downward population trend in 63301 since 1980 (in fact, just the opposite is true).
Unfortunately for researchers, the Census Bureau tabulated the results of the 1990 and 2000 decennial censuses using ZIP codes as they were defined at the time they were prepared. For these censuses, the Bureau created a data product called the ZIP Block Equivalency Files that defined which census blocks were used to approximate the ZIP code areas.
However, these equivalency files were inexact. A single census block frequently contains addresses in more than one ZIP code. Typically, ZIP code "boundaries" fall along back lot lines, while block boundaries almost always run down the middle of streets. As a result, blocks near the boundaries of ZIP codes are typically split between ZIP codes.
Consider the classic rectangular city census block — an area bounded by portions of four city streets. Each of those street faces has its own street name, address range and ZIP code, and it's quite common for the ZIP codes for the four streets to not all be the same. When this happens, the Census Bureau assigns the entire block to a single ZCTA for purposes of tabulating the census. For example, this typical block might have 12 households with 40 persons living in ZIP 11111 and six households with 15 persons living in ZIP 11112; for census tabulation purposes, all 18 households and 55 persons will be tabulated as part of ZIP 11111.
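The whole-block assignment in this example can be sketched in a few lines of Python. This is an illustration only, not the Bureau's actual procedure: it assumes the block goes to the ZIP code with the most households, which is consistent with the example but is not stated as the rule. The function name assign_block and the data layout are hypothetical.

```python
def assign_block(zip_counts):
    """Assign a whole census block to a single ZIP code.

    zip_counts maps ZIP code -> (households, persons) for one block.
    Assumption: the ZIP with the most households takes the entire block.
    """
    winner = max(zip_counts, key=lambda z: zip_counts[z][0])
    households = sum(h for h, _ in zip_counts.values())
    persons = sum(p for _, p in zip_counts.values())
    return winner, households, persons


# The example block from the text: two ZIP codes share one census block.
block = {"11111": (12, 40), "11112": (6, 15)}
winner, households, persons = assign_block(block)
print(winner, households, persons)  # 11111 18 55
```

The 15 persons who actually receive mail in ZIP 11112 end up counted under 11111, which is the misallocation the example describes.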
This process of converting ZIPs to blocks caused rounding errors. These errors are unbiased and may cancel each other out to some extent, but they were still an important source of potential error in the 1990 and 2000 census data.
The Bureau now uses ZCTAs as an alternative to ZIP codes for publishing data. A five-digit ZCTA is typically nearly identical to a five-digit USPS ZIP code, but there are important distinctions. The Bureau's ZCTA page explains some of the differences.
Note that ZCTAs were new in 2000; there were no earlier ZCTAs for doing any time-series analysis. The Bureau published results of the 2000 Census aggregated to these geographic units on Summary Files 1 and 3. Unlike the ZIP codes used for tabulating earlier censuses, ZCTAs are spatially complete, so you can easily map them. ZCTA boundary files are available from the Census Bureau's cartographic boundary files page.
In the MCDC's zcta_master datasets, each row corresponds to one state/ZCTA combination. The columns describe the ZCTAs in several ways. To see a complete detailed listing and explanation of these columns, see the metadata page. Most of the columns provide information about what geographic entities intersect with the ZIP (ZCTA). Here is a display of one row (observation) from the zcta_master data set, showing the entry for ZIP/ZCTA 65201.

For this example, we used a filter of "zcta5 equal to (=) 65201". The output shows that 65201 is primarily in Boone county, MO (FIPS code 29019) and secondarily in Callaway county (FIPS code 29027). The pctcnty variable tells you that 99.8% of the people who lived in this ZCTA (at the time of the 2000 census) also lived in the primary county (Boone). The ZIP is mostly (78.4%) in the city (place) of Columbia, which has a FIPS code 15670 (variable PlaceFP). The variables PlaceFP2 and pctplace2 show that the remainder of the ZIP (i.e. the part that is not within the city of Columbia) is in an unincorporated area.
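The same lookup can be reproduced outside Dexter once the data set has been downloaded as a CSV file. The sketch below uses Python's standard csv module on a tiny inline sample whose values are those quoted for 65201. The column names zcta5, pctcnty, and PlaceFP appear in the text; county and pctplace are assumed names that should be checked against the metadata page.

```python
import csv
import io

# Inline stand-in for a downloaded zcta_master CSV file.
# Values are those quoted in the text for 65201; the `county` and
# `pctplace` column names are assumptions, not confirmed by the metadata.
SAMPLE = """zcta5,county,pctcnty,PlaceFP,pctplace
65201,29019,99.8,15670,78.4
"""


def rows_for_zcta(csv_text, zcta):
    """Return every row whose zcta5 column equals the requested ZCTA."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row["zcta5"] == zcta]


for row in rows_for_zcta(SAMPLE, "65201"):
    print(row["zcta5"], row["county"], row["pctcnty"], row["PlaceFP"])
```

Reading the codes as text rather than numbers keeps leading zeros intact, which matters for ZIP codes that begin with 0.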
Another series of codes identifies various metropolitan and urbanized areas associated with the ZIP. 65201 has 89% of its population living in the Columbia, MO urbanized area and 99.8% in the Columbia MSA.
We have a number of spatial data items associated with the ZCTA: land area in square miles and the "internal point" coordinates (centroid) of the ZCTA. Finally, we have a series of key demographic and economic status indicators taken from the census tables: total population and housing units, median household income, mean poverty ratio and average housing value. Each of the last three has the actual value as well as a corresponding index value that tells you how it compares to the value for the entire country.
Go to the Uexplore home page. Choose Geography/GIS as the major data category, and then choose georef as the filetype. This takes you to Uexplore referencing our geographic reference data sets collection. Click on one of the zcta_master dataset names to select it and to invoke Dexter. Follow the detailed metadata link at the top of the Dexter query form.
If you want all rows and variables, go directly to Section III. Choose Variables and check the box that says you want to keep all the variables. The result is a CSV file that is quite large. We have stored a permanent copy of this CSV file in the georef directory and you can access it directly here.
Some users may need to identify the state/county for a particular set of ZIPs. For these kinds of questions, we build equivalency files or correlation lists that show the relationship between two sets of geographic codes. These files are available via our Uexplore interface, or browse the complete collection of our correlation list data files here. Access the ZCTA-to-county correlation list dataset from this page as well.
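Once a ZCTA-to-county correlation list has been downloaded, picking one county per ZCTA is a small grouping problem. The following Python sketch assumes each row gives a ZCTA, a county FIPS code, and the percentage of the ZCTA's population in that county; this layout and the function names are hypothetical. The 99.8 share for 65201 in county 29019 comes from the example on this page, while the 0.2 share for 29027 is illustrative.

```python
# Hypothetical correlation-list rows: (zcta, county FIPS, percent of the
# ZCTA's population living in that county). The 0.2 value is illustrative.
rows = [
    ("65201", "29019", 99.8),
    ("65201", "29027", 0.2),
]


def primary_county(rows):
    """Map each ZCTA to the county that holds the largest share of it."""
    best = {}
    for zcta, county, pct in rows:
        if zcta not in best or pct > best[zcta][1]:
            best[zcta] = (county, pct)
    return {zcta: county for zcta, (county, _) in best.items()}


def split_zctas(rows):
    """List the ZCTAs that appear in more than one county."""
    counties = {}
    for zcta, county, _ in rows:
        counties.setdefault(zcta, set()).add(county)
    return sorted(z for z, c in counties.items() if len(c) > 1)


print(primary_county(rows))  # {'65201': '29019'}
print(split_zctas(rows))     # ['65201']
```

Keeping the list of split ZCTAs alongside the primary-county assignment makes it easy to see where a one-county-per-ZIP simplification loses information.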
From here, just select the output format(s) of interest from Section I. Choose HTML in addition to the default CSV format. In Section II. Choose rows (observations), you can tell the application which rows you are interested in. The rows in this data set correspond to ZCTAs crossed with counties. Some typical filters that you might want to apply are:
There is no need to enter any filters, if you want the entire data set. There is also a box where you can enter a number that will limit the number of observations/rows on output. To just see what the data looks like you might want to enter 100 in this box to see the first 100 rows of the table (any filters will be applied first).
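The order of operations matters here: the filter runs first and the row limit is applied to what survives. A minimal Python sketch of that behavior follows. It is not Dexter's implementation; the function name extract, the state column, and the sample rows are assumptions made for illustration.

```python
from itertools import islice


def extract(rows, predicate=None, limit=None):
    """Apply the filter first, then keep at most `limit` matching rows."""
    matching = (r for r in rows if predicate is None or predicate(r))
    return list(islice(matching, limit))  # limit=None keeps everything


# Illustrative rows only; `state` holds a state FIPS code (29 = Missouri).
rows = [
    {"zcta5": "62701", "state": "17"},
    {"zcta5": "65201", "state": "29"},
    {"zcta5": "65202", "state": "29"},
    {"zcta5": "65203", "state": "29"},
]

first_two_mo = extract(rows, lambda r: r["state"] == "29", limit=2)
print([r["zcta5"] for r in first_two_mo])  # ['65201', '65202']
```

If the limit were applied before the filter, the same request would return only one Missouri row, because the first two rows of the sample include one from another state.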
Now scroll down the page to Section III. Choose columns (variables). We suggest you click on the box indicating that you want to keep ALL the columns. Now click on the Extract Data button to run the query. It should take a few seconds for your results to be displayed in the form of an output menu page that lets you view your multiple-output-file results. Click on the HTML Report link to see your output in HTML format, or on the Delimited File link to see your delimited file.
The MCDC data archive includes complete sets of geographic header data. These header records have information about the geographic entities summarized on the SF1 data files. There are a great many such entities, and they range from state and county level records all the way down to census blocks.
This is an excellent resource for basic geographic information about ZCTAs and ZIP codes. The emphasis is on names associated with the codes (both "preferred" and "alternate") and location (city, state, county). Latitude and longitude coordinates are also provided. Access the data at http://federalgovernmentzipcodes.us/, where you can download it in CSV, Excel, or MySQL format.
The files have the following key features:
ZCTAs get stale (they are "frozen" at the time of the latest census) and do not include proxies for non-residential ZIP codes (such as PO-box-only ZIPs and "unique" ZIPs assigned to large companies or other organizations). Because of these limitations, you can easily find yourself unable to link 10% or so of the records in your ZIP code file when using the tools described above for ZCTAs. The ZIP codes master file described above contains data for real ZIP codes, not ZCTAs, and it contains a field identifying the county in which the ZIP is (entirely or mostly) located. Unfortunately, it contains only the name of the county and not its FIPS code.
To use the master ZIP codes file on a one-at-a-time manual basis you can simply generate one of the directories we pointed to above and do a manual lookup of each ZIP. To automate the process you will need to generate a file containing the ZIP code and the county. The tool for doing that is once again Dexter. Access the zipcodes dataset in the georef data directory. The query as defined requests output in the form of a CSV file (no report file, no database file), applies no filtering (you get the entire country), and selects the relevant variables (you can ask for more or fewer by modifying the select lists in Section III of the form). If you know how, you can modify the query any way you want, but all you have to do is click on one of the Extract Data buttons. When your output menu page is displayed, click on the link(s) to your output file(s). The only hard part will be figuring out how you will use the resulting lookup table file to do the actual encoding.
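One way to do the encoding step is to load the lookup table into a dictionary and look up each ZIP code in your own file. The Python sketch below assumes the extract has columns named zipcode and county; those names, and the function names, are hypothetical and should be replaced with the actual column headers in your download. The two sample rows use ZIP and county pairs mentioned on this page.

```python
import csv
import io

# Inline stand-in for the lookup table extracted with Dexter.
# Column names `zipcode` and `county` are assumptions.
LOOKUP_CSV = """zipcode,county
63301,St. Charles
65201,Boone
"""


def load_lookup(csv_text):
    """Build a ZIP -> county name dictionary, padding ZIPs to 5 digits."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["zipcode"].strip().zfill(5): row["county"] for row in reader}


def encode(zips, lookup):
    """Attach a county name to each ZIP and report the ones not found."""
    coded = [(z, lookup.get(str(z).strip().zfill(5))) for z in zips]
    unmatched = [z for z, county in coded if county is None]
    return coded, unmatched


lookup = load_lookup(LOOKUP_CSV)
# "00000" is a deliberately unmatched code used to show the failure path.
coded, unmatched = encode(["65201", "63301", "00000"], lookup)
print(coded)
print(unmatched)  # ['00000']
```

Reporting the unmatched codes separately gives a quick measure of how much of a ZIP code file could not be linked.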
The Geocorr applications allow you to dynamically generate files and reports that show how various geographic layers are related to one another. For example, you can choose one or more states as your geographic universe of interest and then ask the program to show you how ZCTAs within those states relate to just about any other geographic layer you can think of.
The Census Bureau created detailed demographic summaries for ZCTAs (both complete and within county) as part of their decennial summary file data products. These are very large collections of detailed tables. In addition, both the Census Bureau and the MCDC have created demographic profile products that take these thousands of data tables cells and summarize them into a few hundred key data items.
MCDC offers demographic profiles at the ZIP/ZCTA level based on decennial censuses.
Other ZIP-related Resources