Extract Data from the MCDC / OSEDA Public Data Archive
Using the Uexplore/Dexter Web Software
Rev. 08/19/2008
See important note about the nature of and intended audience for this application.
Uexplore Application Description | On-line Tutorials | xsamples
Archive Directory
Major Category Index
Decennial Census Data: 2000 | 1990 | 1980
American Community Survey | Pop. Estimates | Economic Indicators | Geography/GIS | Compendia | Other
Recent Updates to the Archive 
- 08/19/08: The latest data from the IRS indicating migration flows based on 2006/2007 tax returns have been added to the
irsmig directory. In addition to the SAS datasets for the entire country, we have copies of the original xls files for Missouri.
- 08/13/08: The economic and social indicators for Missouri counties stored in the
cntypage directory were updated to include more recent data. Specifically, data from the Census Bureau's estimates program (including age, race and hispanic detail) were added, as well as economic data from the BEA REIS program, the latest official population projections and several other items. You can access all the data in the merged08 dataset. You can also access the data in formatted report form by going to the
OSEDA County SEIR page.
- 08/07/08: The latest county population estimates with demographic detail by age, sex, race and hispanic were added to the popests directory. These are the "casrh07" datasets, and there is one per state. The Missouri data are in the mocasrh07.sas7bdat file. The casrhalt subdirectory contains alternate versions of the data in the format as released by the Bureau.
- 07/11/08: The latest sub-county population estimates were added to the popests directory.
The new datasets are ussc07.sas7bdat (estimates for cities, MCDs, counties and states for the entire U.S.) and mosc07.sas7bdat, the Missouri subset of ussc07. They include
estimates for July 1 of each year from 2000 to 2007.
- 06/27/08: County Business Patterns data through 2006 have been downloaded and converted in the cbp directory.
- 06/23/08: Taxable sales data for Missouri for 2007 have been added to the taxsales directory.
- 06/16/08: Data for June 2007 were added to the bankdeps collection of bank deposit figures for branches in Missouri.
- More...
Decennial Census 2000
[Return to top of page]
sf32000x/
Standard Extract based on Summary File 3, 2000 decennial
census. Our most popular (frequently accessed) filetype, by far. In
these data files we have compressed the 16,000+ cells of tabular data on
a full sf32000 summary record down to a summary consisting of just a few
hundred key items. These files are the basis for our
dp3_2k profile reports.
This file type closely
resembles (and is understandably sometimes confused with) the
sf3prof filetype (below), which is the Census Bureau's standard
extract from the same basic data source. But there are important
differences; perhaps the most critical is that these data are available
for a much wider array of geographic entities, down to the block group,
than are the sf3prof data, available (on this site) only for governmental units.
Be sure to look at the Variables.pdf
file which provides an excellent overview of the data items contained in
these datasets.
sf32000/
Summary File 3, 2000 decennial census. This is probably the most
widely used of the summary data files produced by the Census Bureau.
"SF3" contains detailed tables based on responses to the long form
questionnaire. Here is where you can find data on topics such as
income, poverty, housing value, occupation, education, etc. These
data are available for a wide array of
geographic units. The MCDC has data down to the tract and block group levels for at least Missouri,
Illinois, Kansas, Minnesota, Michigan and Delaware (state files). We also have the
final National file with data for every state, county, ZCTA, place, UA,
MSA, etc. in the U.S. Be sure to read the Readme
file in this directory for a good overview.
sf12000x/
Standard extract from the full 2000 Summary File 1 data sets. Basic
demographic counts based on the short-form census questionnaire. One of
the few places where you can find census block level data. For an
overview, see the Readme.html
file.
sf12000/
Summary File 1, 2000 decennial census. The first detailed set of
tables from the 2000 U.S. Census, derived from responses to the short
form questionnaire. Does not have data based on
long-form questions regarding items such as income, housing value,
occupation, etc. Those items are on sf32000, which was released in the
summer of 2002. For an overview of SF1, review Readme.html.
sf3prof/
If you are familiar with the Census Bureau's DP1 to
DP4 demographic profile report products, here you will find all the
data that goes into them. We have one file per state/profile and a
national collection with higher level summaries. You get to choose
between the variable names that came with the data from the Census
Bureau or MCDC-assigned mnemonic names (e.g. v23 is the Bureau
name, Over65 is the mnemonic name). Percent variables as used in
the DP reports (but not included in the data files distributed) have
been added to the datasets. This is a very well-documented and
value-added collection. Geographic entities summarized are mostly
governmental units: states, counties, places, MCDs, metro areas, 106th
congressional districts and American Indian reservations. We have a
complete collection of data for all states and US summaries.
pums2000/
Public Use Microdata Sample files. These files are terrific if you
have good statistical software and know how to use it. With PUMS, you can build
tables any way you like from these datasets, which contain actual
microdata (census returns from individual persons and households). Geographic detail
is limited to special geographic areas called PUMAs. The Census Bureau
releases these files in two product types, a 1% sample file and a 5% sample file.
The MCDC collection includes a complete set of 5% sample files for all states
and a smaller set of the 1% sample files (Missouri, Illinois and Kansas). SAS
users at universities may contact the MCDC about obtaining access to this collection
directly on the University of Missouri site using a special server.
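As a minimal sketch of what "building your own table" from microdata means: each sample record stands for a number of persons given by its weight, so a table cell is a sum of weights rather than a count of records. The records and field names below (`puma`, `tenure`, `weight`) are made up for illustration and are not the actual PUMS variable names.

```python
from collections import defaultdict

# Hypothetical person records; field names and values are illustrative only.
records = [
    {"puma": "00100", "tenure": "owner", "weight": 20},
    {"puma": "00100", "tenure": "renter", "weight": 15},
    {"puma": "00100", "tenure": "owner", "weight": 25},
    {"puma": "00200", "tenure": "renter", "weight": 30},
]

def weighted_table(rows, row_key, col_key, weight_key="weight"):
    """Sum record weights into cells keyed by (row value, column value)."""
    table = defaultdict(int)
    for rec in rows:
        table[(rec[row_key], rec[col_key])] += rec[weight_key]
    return dict(table)

table = weighted_table(records, "puma", "tenure")
for cell in sorted(table):
    print(cell, table[cell])
```

The same pattern extends to any combination of variables, which is the flexibility the pre-tabulated summary files do not offer.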
ctpp2000/
Census Transportation Planning Package. These files represent a special tabulation of the 2000 Census long-form data for use in transportation planning applications. As such, the tables cover commuting information such as when people leave home for work, how they travel to work, how long it takes to get there, etc. There is also some custom geography found on these files, such as TAZs (Transportation Analysis Zones) and MPOs. There are 3 parts to this collection: Part 1 provides table summaries based on where people resided; Part 2 is based on where people worked; and Part 3 deals with dual geography, giving characteristics of commuters for specified origin-destination geographic combinations. These data were prepared by the Census Bureau using specifications from the US Dept of Transportation, which distributes the data and is responsible for its content. We have downloaded and converted files for Missouri, Illinois and Kansas only.
ctppx/
Census Transportation Planning Package 2000 standard extract.
These extracts are derived from the datasets in the ctpp2000 complete-tables collection.
cqr2000/
"CQR" (Count Question Resolution) was the Census Bureau's program to identify errors in the total population and housing unit counts in the 2k census. This directory has datasets that capture those adjustments at both the census block level (one dataset and csv file per state) and at the governmental unit (state, county, place, mcd) levels.
daytmpop/
"Daytime population" special tabulation.
Attempts to estimate the number of people who may be in an area on a typical work day.
mig2000/
Datasets in this directory are related to migration in the U.S.
between 1995 and 2000 as derived from the 2000 decennial census long
form (sample) data. The Census Bureau released a number of different
summary files in this category, but our collection contains only the
basic counts of movers. This is a national collection.
eeo2000/
This loosely affiliated collection of tables (datasets) is named for
the Equal Employment Opportunity Commission, 1 of 4 federal agencies
that commissioned this special tabulation product. Here you will find
detailed counts of persons by occupation categories by age, sex,
race/ethnicity, education level, income level and even, sometimes, by
industry. Hard to describe since it is a very complicated collection of
24 tables, each with its own geographic, demographic and occupational
dimensions. We tried very hard to make this collection simple
to access, but the task proved to be impossible.
workflow/
County to county work flows. This special tabulation file is based on
sample (long form) data from the 2K census. It gives you the count of
persons who live in County R and work in County W for all counties in
the country. The MCDC has created some
custom reports based on these
files, and has created a web application that accesses the data for any U.S. state.
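To illustrate how residence-by-workplace counts of this kind can be summarized, here is a small sketch. The rows and county codes are invented for the example and the layout (residence county, work county, worker count) is an assumption, not the actual structure of the workflow datasets.

```python
from collections import defaultdict

# Hypothetical rows: (residence county, work county, workers). Values are made up.
flows = [
    ("29019", "29019", 5000),
    ("29019", "29051", 800),
    ("29051", "29019", 1200),
    ("29051", "29051", 3000),
]

def commuting_summary(rows):
    """Per county: workers living there, workers employed there, and net in-commuting."""
    residents = defaultdict(int)
    jobs = defaultdict(int)
    for home, work, count in rows:
        residents[home] += count
        jobs[work] += count
    counties = sorted(set(residents) | set(jobs))
    return {
        c: {
            "resident_workers": residents[c],
            "workers_employed_here": jobs[c],
            "net_in_commuting": jobs[c] - residents[c],
        }
        for c in counties
    }

summary = commuting_summary(flows)
for county, stats in summary.items():
    print(county, stats)
```

A positive net figure means more people work in the county than live there and work, i.e. the county imports commuters.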
sf42000/
See the Census Bureau's Abstract
for a description of this data collection. The key feature of "SF4" is
the ability to get detailed tables for a long list of race/ancestry
groups. However, new threshold limitations (explained in the Abstract)
make using it for analytical purposes very problematic. The large number
of tables combined with the large number of characteristic
iterations makes these files huge. Because of the enormous size
and complexity of this collection, we strongly recommend that users new
to the collection begin by accessing the SF4 data using the American
FactFinder application, the Data Sets option. It tends to be a lot
easier than accessing via Dexter.
sf22000/
Summary File 2, 2000 decennial census. Important for those who are
interested in detailed complete count data for special
race/hispanic/Indian tribe population groups. Less important than in
previous censuses because so much of the data here was already released
as part of SF1. The special subgroups ("characteristic
iterations") are represented on separate observations, identified by
the ID variable CharIter. See our SAS
format code showing the values of these codes. See the Census
Bureau's SF2
page for more information, including access to the data via American
FactFinder. Data for a population subgroup are present
for a geographic area only if the area has at least 100
people in that category.
pl942000/
Public Law 94-171 (Redistricting data). This was the first data
published based on the 2000 decennial census. Contains basic pop counts
by race/Hispanic and voting age for a wide variety of geographic levels
(including VTDs - Voting Tabulation Districts), down to census block.
The MCDC has a complete national collection of these data. For an
overview see the Readme
file.
hudcdbg/
These Community Development Block Grant data are derived from 2000
Census data by the U.S. Dept of Housing and Urban Development (HUD).
The datasets contain information regarding low and moderate
income housing down to very small (block group) geographic areas.
These numbers are used by local developers and planners wanting to
qualify for special grant funding in neighborhoods. See the HUD
web site where we went to download these data, and where you can
download it in the form of Excel spreadsheets for any state in the
country. Our collection is limited to Missouri, Illinois and Kansas.
Decennial Census 1990
stf903x2/
1990 Summary Tape File 3: standard extract. Same basic idea
as stf903x, but this filetype was created to be comparable to the
SF3-based extract data for 2000 (filetype sf32000x). Data from these sets are used in the
dp3_2kt
trend reports.
stf903/
1990 Summary Tape File 3. Each dataset contains over 3300 cells of pre-tabulated data based on the 1990
census long-form questionnaires. Each observation contains data for a single geographic area. We have complete "A" files for Missouri, Illinois and Kansas plus a few other states; we also have the complete "C" file (national) with summaries for the country, states, counties and larger cities. And, we have the "B" file - ZIP level summaries. This filetype has been made accessible at the table level from Dexter. As with any of the census summary file filetypes, you really need to have access to the technical documentation -- available in the stf903/Docs subdirectory of this archive -- before attempting to use these data. The stf903x and stf903x2 filetypes are
derived from these files and are appropriate for quick overviews or access to frequently-used
variables. This data collection was substantially restructured in early 2005.
stf903x/
1990 Summary Tape File 3: standard extract (used for Basic
Tables web reports). These datasets are faster and friendlier to access than the much larger stf903 sets from which they are derived. See also the stf903x2 alternative filetype, above.
stf904/
These are very large 1990 census summary files, featuring large multi-dimensional tables and separate files that summarize subpopulations based on race and/or hispanic origin. We have File A and File B for Missouri only
and all of File C (the national file). We have B Table files for total pop, the 5 basic race groups (White, Black, American Indian+, Asian & PI, Other),
the 5 basic groups/non-hispanic and Hispanic (12 "chariters" - characteristic iterations).
We have added labels to the variables in these datasets (in December, 2005) and made them accessible at the table level via Dexter. The complete technical documentation is accessible from the Docs subdirectory.
stf901x2/
1990 Summary Tape File 1: standard extract 2: Specifically designed
to be used as the 1990 equivalent of the data in filetype sf12000x. Many
sets in this directory have been re-tabulated to 2000 census geography
to allow for direct trend reports using comparable geography.
stf901/
1990 Summary Tape File 1 is the 100-percent file on population and housing. No social and
economic indicators are associated with this file.
stf901x/
1990 Summary Tape File 1: standard extract
stf902/
1990 Summary Tape File 2.
stf420/
Place of Work Destinations File. The comparable filetype in 2000 is Workflow. (The name comes
from the fact that it is based on Table 20 on Summary Tape File 4.)
pums90/
Public Use Microdata Sample, 1990. We have 5% files for Mo, Il & Ar and the 1% files for the
entire U.S.
stf9s5/
1990 Special Tabulation File 5: Commuting Patterns by county
eeo90/
1990 Equal Employment Opportunity file.
stp154/
Special Tab Product 154 (1990 census): Commuting patterns by Place-tract ("Daytime population" file)
stp28/
Special Tab Product 28: County to County Migration 1985-1990. One of
our "specialties".
pl9490/
1990 Public Law 94. A few variables about lots of geographic areas
(aka "redistricting file")
pl9490tx/
Public Law 94-171 (Redistricting data) from 1990 Census -
special extracts used (along with data from the pl942000 directory) in
creating PL94
Basic Trend reports.
cqr90mo/
Census Quality Review data: documents geographic errors and fixes in
1990 census geography. We have just the single txt file for state of Missouri.
Decennial Census 1980
stf803/
1980 Summary Tape File 3: Complete national collection with "A" files for every state
as well as a national ZIP code file and the "C" file with higher level geographies for the
entire U.S.
stf803x2/
1980 Summary Tape File 3 standard extract, revised. Same idea as the stf803x standard extract, but this is by far the better collection. Content was chosen to be as compatible as possible with the similar extracts from the 1990 and 2000 censuses. Complete national collection.
stf803x/
1980 Summary Tape File 3 standard extract. (Needs work - suggest using stf803x2 instead).
stf801/
1980 Summary Tape File 1
stf801x/
Extract 1980 Summary Tape File 1
marf2/
Master Area Reference File 2, 1980. These files provide a geographic inventory for the 1980 census. Each record/observation describes the geographic codes associated with a 1980 geographic summary area, together with some basic population and housing counts, and pci (Per Capita Income - 1979). For some geographic levels there are internal point latitude-longitude coordinates and/or land areas.
The data are similar to the geographic headers portion of the 1980 Summary Tape File 1 files. Geographic units summarized are states, counties, MCDs (county subdivisions), places (cities), census tracts (BNA's), block groups and enumeration districts. Be sure to check out the extensive Readme.html file.
pums80/
Public Use Microdata Sample, 1980. Microdata files with long-form response data from that census. Each record/observation describes either a person or a household. A data dictionary file is included, and the Tools library should be very useful for anyone wanting to access these data using SAS.
American Community Survey
acs/
Destined to become one of the most important filetypes in the archive. The American Community Survey is being phased in over this decade and is intended to replace long-form (sample based) decennial census data by 2010. This generic directory is only for holding materials (such as the Readme page) that are about the ACS in general. Data for specific years are stored in the acs20YY directories (below) or in the acspums directory ("filetype").
acs2006/
These data were released in the summer of 2007. They are the
tabulated results of the surveys for the 2006 calendar year. Unlike the
2005 tabulations these include persons in group quarters as part of
the universe. These data are still limited to geographic entities of
65,000 or more population. Our collection includes data for the entire U.S. and includes completed
detailed (base) tables as well as profile datasets that are similar to the data found in the Census
Bureau's profile reports accessible via FactFinder.
acs2005/
Data tables from the ACS for the calendar year 2005 are summarized here. This is the first
substantial set of data tables ever to appear based on the ACS. Summaries are for geographic areas
of at least 65,000 population. The group quarters segment of the population was not covered in the 2005
survey so all figures here summarize just the household population. There are no moving averages
here, just tables based on a single year of surveys. These data were released in "waves" during the summer and fall of 2006.
acspums/
This is the American Community Survey's 1% Public Use Microdata Sample. These data are only of direct interest to researchers with access to, and skill in using, a statistical software package. We have complete national collections for 2004, 2005 and 2006.
Population Estimates
popests/
More recent population estimates and projections from many different sources, for
many different geographic areas and units. More for Missouri than for
elsewhere but some good state and county level stuff for the entire
country. Some with historical trends, some with components of change.
These are all post-2000 estimates (along with a very small number of projections), with
one key exception. (Note: for the latest Missouri county level projections, see the moprojs filetype, just below.)
nchsbri/
This popests subdirectory contains special estimates commissioned by NCHS (National Center for Health Statistics)
using "bridged" race categories,
i.e. using race standards established by OMB in 1977 rather than the current ones established in 1997.
The complete national collection has 4 datasets per state, 2 based on 1990 intercensal estimates and 2 based on post-2000
estimates. State and county level numbers with detail by single years of age, race, sex and hispanic origin. Great raw data resource for demographer types.
moprojs/
Missouri population projections at the state and county level out to the year 2030. Done in 2008 by the state demographer in the Office of Administration in Jefferson City. These projections were done using the latest census results and estimates then available. These numbers represent the "preferred" series.
popests2/
More estimates, but these are older and of interest mostly for historical or trend analyses. Most were
released by the Census Bureau during the 90's and contain data estimated as of some year or years
within that decade.
saipe/
Small Area Income & Poverty Estimates. See the Economic Indicators section.
(Does include some population estimates as well.)
Economic Indicators
beareis/
Files from the Bureau of Economic Analysis
Regional Economic Information System (REIS). Contains time series data
on employment, income, farm income, transfer payments and an overall
economic profile for all states and counties in the U.S. A U.S. file
contains summaries for the nation and BEA regions. Updated each summer.
There is usually a 2-year lag in getting these data out. In April 2008 we completely
replaced the data collection with new data sets rebenchmarked and with new data
for 2006.
bls_la/
These are (un)employment statistics from the Bureau of Labor Statistics -- their "la" (local area) series data. We have significantly restructured the data and have added badly needed FIPS state and county codes to make the data mergeable with other statistics. Monthly and annual average employment, unemployment and unemployment rate data back to 1990 for all US (+PR) states and counties. Data goes back to 1976 for states and includes seasonally adjusted data at state level. These datasets will be updated periodically, at least once a year. (Last updated 2/07 with data thru 12/06.)
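Why the FIPS codes matter for merging can be shown with a short sketch. It assumes a combined 5-character key (2-digit state code followed by a 3-digit county code); the function name, field names and all data values are hypothetical and chosen only for illustration.

```python
def fips_key(state, county):
    """Build a 5-character county key: 2-digit state FIPS + 3-digit county FIPS."""
    return f"{int(state):02d}{int(county):03d}"

# Hypothetical unemployment rates keyed by county FIPS (values are made up).
unemployment_rate = {
    fips_key(29, 19): 3.9,
    fips_key(29, 510): 6.1,
}

# Hypothetical population rows from another source, with separate code fields.
population_rows = [
    {"state": "29", "county": "019", "pop": 150000},
    {"state": "29", "county": "510", "pop": 350000},
    {"state": "29", "county": "001", "pop": 25000},
]

# Merge on the shared key; counties with no rate get None rather than being dropped.
merged = [
    {
        "fips": fips_key(row["state"], row["county"]),
        "pop": row["pop"],
        "unemp_rate": unemployment_rate.get(fips_key(row["state"], row["county"])),
    }
    for row in population_rows
]

for row in merged:
    print(row)
```

Zero-padding the codes is the important detail: a county stored as 19 in one file and "019" in another will not match unless both are normalized to the same key.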
saipe/
These are small area (county and school district level) income and poverty estimates from the
HHES group at the Census Bureau. These are inter-censal estimates generated using complex
statistical methodologies. The latest estimates tend to be about 2 or 3 years old. Data are
usually for the entire U.S.
cbp/
County Business Patterns
empwage/
Employment and wage data for Missouri. Based on ES-202 files for the
state. County level summaries for various years, Missouri only.
taxsales/
Taxable sales for Missouri counties by SIC by quarter, starting with
year 2000. Data is from the Missouri Dept of Revenue. Lots of
suppression here when you look for detailed SIC info, but the data for
total sales without SIC detail is there. One dataset per year at the
county/SIC level, and then a single summary set with just total sales by
county by year (with state totals as well). This filetype replaces the
old mosals type that we had with data from the mid to late 90's.
bankdeps/
Banking Deposits data for Missouri. Data are for individual branches with names, addresses and estimated deposit info from
the FDIC. The data have also been aggregated to Missouri counties and the state. Data are available for 1999 thru 2007.
Geography/GIS
georef/
Extensive collection of geographic reference data. A mixture of national and Missouri-specific files.
See related filetypes corrlst and mable2k (next entries).
corrlst/
These are our geographic "correlation list" (aka "equivalency file")
datasets. They deal with how various geographic layers correspond to
each other. For example, how ZIP codes correlate with Congressional
Districts. Included (as a subdirectory) is the MABLE database used in
the MABLE/Geocorr web application. Many of these datasets (and
many more like them) can be generated using the MABLE/Geocorr
dynamic web application. Where many of these datasets may only be for
Missouri and neighboring states, MABLE/Geocorr works for the entire
country.
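A rough sketch of how a correlation list can be used: if each row links a source area to a target area and carries an allocation factor (the share of the source that falls in the target), counts known for the source geography can be re-allocated to the target geography. The codes, factors and counts below are invented, and the three-column layout is an assumption rather than the actual corrlst dataset structure.

```python
from collections import defaultdict

# Hypothetical correlation list rows: (ZIP code, congressional district, allocation factor).
# Factors for each ZIP sum to 1.0.
corr_list = [
    ("99901", "01", 0.75),
    ("99901", "02", 0.25),
    ("99902", "02", 1.0),
]

# Hypothetical counts known only at the ZIP level.
zip_counts = {"99901": 1000, "99902": 400}

def allocate(counts, corr_rows):
    """Distribute source-area counts to target areas in proportion to the allocation factors."""
    out = defaultdict(float)
    for source, target, factor in corr_rows:
        out[target] += counts.get(source, 0) * factor
    return dict(out)

district_counts = allocate(zip_counts, corr_list)
print(district_counts)
```

The result is an estimate, since it assumes the quantity being allocated is spread across each ZIP code the same way as whatever the factors were based on.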
mable2k/
Master Area Block Level Equivalency files. This is the
database constructed for use in the Mable/Geocorr2k web application. It
is a distillation of the information contained in the geographic headers
files from Summary File 1, 2000 census, with some augmentations based on more recent TIGER line files and various other geographic sources such as CBSA codes for counties. The Missouri datasets have some extra
codes not available for the rest of the country. Most users will want to
access these data using the web application at http://mcdc2.missouri.edu/websas/geocorr2k.html.
mable98/
Similar to the mable2k data collection, but this is the
previous edition with older geographic codes. You can access these using
the original 1990 version of MABLE/Geocorr at http://mcdc2.missouri.edu/websas/geocorr90.shtml.
gics90/
1990 Geographic Identification Code Scheme (Census). Nice reference sets for basic geographic
entities as defined for the 1990 decennial census.
Compendia
mosenior
The Missouri Senior Report, with its in-depth
analysis and county level ranking of the state of the state's elderly population.
OSEDA (the Office
of Social and Economic Data Analysis at the University of Missouri-Columbia,
an MCDC core agency) was
responsible for the data and web site development on this project. The data used in the web site reports are
stored as part of the MCDC's data archive. You can access these data in the
mosenior data directory.
(See links under Data and Maps/Download Data Files on the MO Sr Report web site).
cntypage
The Missouri County Summary of Social and Economic Indicators was developed by OSEDA in collaboration with University of Missouri Extension
personnel in the fall of 2005. Includes key indicators used by Extension personnel. It contains data from the 2000 decennial
census, as well as the latest population estimates, current housing unit estimates, key employment and personal income categories from BEA, and a host of other items related to children (Kids Count indicators), family and health status indicators. Geographic summary units are the state, its counties and the UM Extension regions.
See the Missouri County Summary of Social and Economic Indicators web site for access to these data as formatted reports.
indctrs/
Key indicators database. Important collection of datasets that have
been created mostly by extraction of key data items from other sets in
this archive. Emphasis is on data for Missouri (the state, its counties
and various regions), and most have data for at least two points in
time. This collection of data is the basis of all reports and analyses
published by OSEDA on their web site (starting in 2002.)
kidscnt/
Kids Count is a national program sponsored by the Annie E. Casey
foundation. The data for Missouri (all we have in the archive) comes from a myriad of sources, mostly
within state government. They are all collected here and are used as the
source of the tables / charts / reports / maps etc that can be accessed
at the Missouri Kids Count Data Book Online web site.
srcount/
Senior count. This is a similar concept to Kids Count (kidscnt) but
with the focus on the older population. We have thus far taken data mostly from the
1990 and 2000 decennial censuses. (Not to be confused with the more recent alternative mosenior filetype, which will most likely make this collection (srcount) obsolete.)
desex/
Demographic indicators extracted from the 1990 & 2000 censuses and other public sources created
specifically for the Missouri Dept of Elementary and Secondary Education (DESE). These data
are the basis for the DESE Socio-Economic Indicator Resource
web application. Summaries at various geographic levels, most of them within School District.
modotx/
The MoDOT SEIR extract was created for the Mo. Dept of Transportation for use in their
Socio-Economic Indicator Resource web-based
system (developed for MoDOT by OSEDA.) Geography is geared toward MoDOT apps but includes
RPCs, counties and places.
Other
irsmig/
County level migration data based on IRS tax returns. We have data for the entire U.S. based on tax years
as early as 1999/2000 and as recent as 2006/2007. More data (for prior tax years back to 1983) are available at OSEDA and could be added to the archive if users indicate interest.
ag2002/
Census of Agriculture, 2002. Only a few selected Tables have been converted and
placed in the MCDC Library so far. More to come if the folks at USDA ever make good on their
promise to make the data available in a reasonable format for importing into our
data archive. Until then go to their web site at
www.nass.usda.gov/census/census02/.
movoters/
Data regarding Missouri voters and voting results. Most (currently all) data come from the
Missouri Secretary of State's office. There are no data on individuals that are publicly accessible in this directory.
Note: If you know what filetype you want you can
explore the data by accessing the main data directory (sorted alphabetically by filetype) at:
http://mcdc2.missouri.edu/cgi-bin/uexplore?/pub/data/
Important Note:
This application is intended for use by people wanting to
access data by querying a database. It is most commonly and
successfully used by members of the MCDC core group/affiliate
network, by "power users" who are comfortable accessing and manipulating machine-readable data files,
or by end users being guided to appropriate resources (directories or
files) by MCDC staff or other knowledgeable intermediaries. It does NOT (generally)
provide direct, easy access to processed data in the form of reports, charts, maps or other custom data products. Instead, it provides access to the raw data which we and others use to create such products. Users
looking for those kinds of custom products/interfaces should consider other links on
the OSEDA or MCDC home pages, or perhaps on the MCDC Site Map page. First time users of this application should read
the Uexplore Application Description (see link at top of page) to
decide if this is the sort of access they are looking for. If so, they
should then spend some time looking at the xsamples (annotated examples of sample extracts) and/or the On-line Tutorials. Again, links to these and other relevant material are at the top of this page.