Examples of Using the MABLE/Geocorr WWW Application


Overview

The MABLE/Geocorr application allows you to create reports and/or comma-separated value ("csv") files showing various kinds of information about U.S. geography, especially census geography (i.e., things like census tracts and blocks -- defined by the U.S. Census Bureau for the purpose of tabulating the decennial census.)
Probably the most common application of geocorr is the creation of a correlation list report showing how one kind of geography relates to another. For example, you can create a report showing how ZIP codes correspond to counties and cities (places) for the state of Missouri (or just for specified counties in Missouri.) This is the "typical" use; there are several others, which are really just special cases. For example, you can specify that the "target" geography -- normally, the set of geographic areas to which you want to compare the first, or "source" geography -- is the "Entire Universe"; in this case, the application simply lists a series of geographic codes, related names and 1990 population counts for a specified area. It's a very easy way to list all the ZIP codes in Kansas City (for example.)

We'll present a set of simple examples to give you a flavor of the kinds of things that geocorr can do for you. You should spend some time with the extensive HELP module available for the application (it's all in a single HTML file so it can easily be printed and read off-line if you prefer.) If you are not familiar with one or more of the various geographic units shown in the geocode selection lists, you should follow the hyperlinks to the MAGGOT geographic glossary page. (Both the HELP and MAGGOT pages can be reached from the row of reference links (beginning with "Examples") across the top of the main form.)

As you go through these examples we suggest you enter the specifications as described and run the application, so you can see for yourself what the results will look like. You might even want to get daring and create your own examples. After reviewing the results of a run, use the back key to return to the form page, change one of the parameters, and run the request again to see what difference it makes.

It is important to note that while geocorr may appear to be overly complex for a casual user, the most typical applications (as illustrated, we hope, by these examples) require the user to specify only a small number of options. You can begin doing simple correspondences and geocode reference listings, and then when you master these basic applications, proceed to the more advanced options involving things such as circular areas, bounding boxes and the much misunderstood "Concentric Ring Pseudo-Geocodes". You can get 80% of the functionality of the application using about 20% of the options. Casual and first-time users should begin by focusing primarily on the Input Options and Output Options sections. They are the only required specifications.


Geocorr Example 1: Simple Correlation List

The "classic" application of this tool is to look at how one geographic layer (such as census tracts) relates to another layer (such as 5-digit postal ZIP codes) in some geographic universe (such as Missouri.) Make the following parameter selections (options flagged with a "(D)" represent default selections, meaning these will be chosen if you do not modify the selection that is preset by the application.) Hit one of the several "Reset Defaults" buttons prior to entering these specifications for a new Example in order to restore all the defaults. Note that the normal action is for the application to remember what you have entered previously (during the same session - if you exit the application and come back in, then geocorr will not remember what you entered in any previous session). So if you run the application and then, upon seeing the results, decide you just want to change one or two little options, you can easily do so.

  Option/parm    Specify Value          Comments
  -----------    -------------          ----------
Input Options:
 state           Missouri               Use scroll bar to get to it in select box
                                        at top of Input Options section. You MUST
                                        select a state.

 SOURCE geocodes County (1990) (D)
                 Census Tract/BNA       If you select this without selecting
                 (1990) (D)             "County", geocorr will select it for
                                        you, since a tract/BNA without a county
                                        makes no sense.

 TARGET geocodes ZIP code (D)           You should have lots of questions about
                                        this choice. What's the source? How
                                        current and complete is it? See the
                                        MAGGOT entry to get more information.

 Weighting Var.  Population (D)         Shows up in output report and file.

 Ignore zero ... (check it)             You should almost always choose this.

Output Options:
 Have weighted.. (check it)             See HELP for details. Puts an x-y coord.
                                        pair on each output line/record.

 Generate a CSV..
                 (check it) (D)         Program will know to generate a comma
                                        separated value file with "results".
                                        Leave the "Just Codes" selection in the
                                        select box for the CSV file.

 Generate a listing file...........
                 (check it) (D)         Generates a report-format output file.
                                        This is what the human looks at. (The
                                        .csv file is what a program looks at.)
                                        Check on "Codes and Names" option here.
                                        Then you'll get names for the counties
                                        and the ZIP codes (but not for the
                                        tracts -- census tracts are not named.)

Point-and-Distance Options: (skip)

Bounding Box Filter Options: (skip)

Geographic Filter Options:
 County codes text
                 019                    This is the 3-digit FIPS county code for
                                        Boone co. Note that the "County codes"
                                        string is a hyperlink to code pages
                                        showing all FIPS county codes. Try it!
                                        Could also enter "29019", but since we
                                        have selected just one state we can get
                                        by with just the 3-digit county code.

Click on the "Run Request" button to initiate processing. A Perl script will examine what you entered on the HTML form to verify that there are no invalid or potentially dangerous characters being passed. It will then invoke SAS(r) and pass it the form information and tell it to run the special geocorr SAS program. Geocorr should take anywhere from a few seconds to several minutes to execute, depending on system load and on your request. This example should take about 20 seconds if there is a normal load on the system. The hour glass will appear while it is running. When it finishes you should be presented with a menu screen labeled "Results of Query". This menu page will list the 5 output files produced by the request with a brief description of what is contained in each. You can almost always ignore the first of these files (the SAS program log). The summary.log should always be checked for any warning or other messages regarding the query. It will provide information about when and where the application ran, what parameters were specified, the number of records written to the output files, how long it took, etc.

The important outputs are the geocorr.lst and geocorr.csv files. Click on each to view them in your browser. Use the browser to save them to a local file and/or to print them. Notice the dramatic difference in their formats, but the nearly identical nature of their contents. Because we asked for "codes and names" on the listing file and not on the CSV file, there will be some content difference in this case. But the basic data content is the same.

Look very carefully at the first two data lines (after the title and column header lines) of the listing file. These two lines have information about the first value of the "source" geocodes we requested -- the first county-tract. They show each of the values for the "target" geocode(s) that intersect with this area. In the example, we see that tract 0001.00 is partly in ZIP 65201 and partly in 65203. The degree of the intersection is measured by the weighting variable, 1990 total population. This small tract has only 430 people in it, and of these, about 408 lived in 65201 in 1990 and the other 22 lived in 65203. The AFACT (allocation factor) column shows the decimal portion of the source area contained in the target area -- ".949" in the first line means that 94.9% of the tract is contained in the ZIP code for that line. This is based on 1990 population. If we had chosen Housing Units or land area for our weighting variable, we'd see a different value for this factor.

Notice that many of the tracts appear on only one line -- they correspond entirely to a single ZIP code. And notice that the values of the AFACT column always sum to 1.0 for all the lines corresponding to one tract.
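To make the AFACT arithmetic concrete, here is a minimal sketch in Python (geocorr itself is a SAS program, so this is an illustration and not the application's code). The block records are invented so that the totals match the tract 0001.00 example above; the function and variable names are our own.

```python
from collections import defaultdict

# Hypothetical block-level records: (tract, zip_code, population).
# The per-block values are invented; only the totals (408 and 22 of 430)
# come from the example in the text.
blocks = [
    ("0001.00", "65201", 250),
    ("0001.00", "65201", 158),
    ("0001.00", "65203", 22),
    ("0002.00", "65201", 900),
]

def allocation_factors(records):
    """Return {(source, target): (weight, afact)} from block-level records."""
    pair_weight = defaultdict(int)    # weight in each source-target intersection
    source_weight = defaultdict(int)  # total weight of each source area
    for source, target, weight in records:
        pair_weight[(source, target)] += weight
        source_weight[source] += weight
    return {
        pair: (w, w / source_weight[pair[0]])
        for pair, w in pair_weight.items()
        if w > 0  # skip empty intersections, like the "Ignore zero ..." option
    }

factors = allocation_factors(blocks)
for (tract, zip_code), (pop, afact) in sorted(factors.items()):
    print(f"{tract}  {zip_code}  {pop:6d}  {afact:.3f}")
```

Each tract's factors add up to 1.0 because every block in the tract lands in exactly one tract-ZIP pairing.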

The columns labeled "INTPTLNG" and "INTPTLAT" are poorly named. These appear as a result of our checking the option to have "weighted centroids" calculated and kept on the output file(s). Where do these come from? The geocorr program is working with a database that has observations at the 1990 census block level. Each observation has these "internal point" coordinates indicating where the spatial centroid of the census block is located. When the program generates a line of the output files, it is really just combining information from all the blocks that are in the intersecting areas. The first line of the report comes from looking at all blocks that are in tract 0001.00 and in ZIP 65201 and summing the 1990 populations of those blocks. At the same time the program looks at the latitude-longitude coordinates of each block centroid and weights each by multiplying it by the population of that block. After processing all blocks for a tract-ZIP pairing, and before writing the output line, the program divides the weighted coordinate sums by the population total for the area, creating this "weighted centroid". This location is biased towards where the people actually live within the area, rather than based just on the geometry of the census blocks (if land area is chosen for the weighting variable, then the resulting weighted centroids are more of a spatial center.)
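The weighted centroid calculation just described can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the geocorr code; the block coordinates and populations are invented, and the function name is hypothetical.

```python
def weighted_centroid(blocks):
    """blocks: iterable of (latitude, longitude, weight) tuples.

    Returns the weight-averaged (latitude, longitude), or None when the
    total weight is zero (no meaningful centroid exists in that case).
    """
    sum_lat = sum_lng = total = 0.0
    for lat, lng, weight in blocks:
        sum_lat += lat * weight   # weight each coordinate by the block's weight
        sum_lng += lng * weight
        total += weight
    if total == 0:
        return None
    return (sum_lat / total, sum_lng / total)

# Two invented blocks: the more populous one pulls the centroid toward it.
sample_blocks = [(38.95, -92.33, 300), (38.97, -92.31, 100)]
centroid = weighted_centroid(sample_blocks)
print(centroid)
```

With these invented numbers the result sits three quarters of the way toward the 300-person block, which is the "biased towards where the people live" effect.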

The comma delimited (".csv") file can be browsed and then saved to your local disk with your browser's "save as" command. (You might even want to configure your browser to invoke a helper application to customize processing.)

After you save the file, you should be able to open it for processing by most spreadsheet and data base programs in Windows. Notice that the first line of the file contains the names of the fields -- when you import these data into Excel or Lotus you'll see that these names appear as the first row of the spreadsheet. To get a more detailed description of what these variables are you can browse the varlst.lst file, the last entry on your Query Results page. This is usually a very short file, and in most cases we ignore it (because we already know what the fields are -- but you may find it very helpful in trying to interpret what you have.)
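If you would rather read the file with a program than with a spreadsheet, something like the following Python sketch should work. The field names and rows below are assumptions made up for illustration (check the first line of your own file, or varlst.lst, for the real names).

```python
import csv
import io

# Stand-in for a saved geocorr.csv; the header names here are hypothetical.
sample = io.StringIO(
    "county,tract,zip,pop,afact\n"
    "29019,0001.00,65201,408,0.949\n"
    "29019,0001.00,65203,22,0.051\n"
)

def read_geocorr_csv(handle):
    """Read rows keyed by the field names on the first line of the file."""
    rows = []
    for row in csv.DictReader(handle):
        # Geographic codes stay as text so leading zeros are not lost;
        # only the numeric measures are converted.
        row["pop"] = int(row["pop"])
        row["afact"] = float(row["afact"])
        rows.append(row)
    return rows

rows = read_geocorr_csv(sample)
print(rows[0]["tract"], rows[0]["zip"], rows[0]["pop"], rows[0]["afact"])
```

For a real file you would pass `open("geocorr.csv", newline="")` instead of the in-memory sample. Keeping the codes as text matters because a tract such as 0001.00 or a county such as 019 would otherwise lose its leading zeros.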

Geocorr Example 2: Bi-directional Correlation List

In this example, repeat all the options chosen in the 1st example, except for the following:

  Option/parm    Specify Value          Comments
  -----------    -------------          ----------
Input Options:
 State           Missouri and Illinois  Will need to hold down Ctrl key to
                                        select two items from the Select list
                                        (with most browsers, at least).

 Source geocodes Place: city...

 Target geocodes County                 Output will show places (cities)
                                        related to counties.

 Weighting var   Housing Units          Instead of default, population.

Output Options:
 Weighted centroids
                 Turn off.              We really don't use these too much.

 Generate AFACT2 Turn on.               This will cause the program to do
                                        double work in terms of "allocation
                                        factors". Now we get the portion of the
                                        source codes in the targets, and the
                                        portion of the targets in the source
                                        areas.

 Generate CSV file
                 Turn off.              It'll run faster without it, so if you
                                        don't plan to read it with a program...

Geographic Filtering options:
 County codes    (blank)                Will not be filtering at county level.

 Metro areas     7040                   Selects St. Louis MSA. To see all the
                                        metro codes you can enter, note that
                                        the "Metro Area codes" heading is a
                                        hyperlink.

Hit one of the Run Request buttons to submit the new request. Follow the usual procedure to view your output elements by clicking on the filenames on the Query Results page. The key output is the listing file. What you should see if you entered the options as specified is a rather long report that lists all of the cities (places) in the St. Louis MSA (including the Illinois side.) It is not really much of a report in terms of showing any geographic correlation. Mostly, it simply tells you what county each place is located in. There are a few cases of a place being in more than one county, in which case it shows you what portion of the place is in each of the counties. Note that the value of AFACT2 represents the portion of the county that is in the place (so we see that about 14% of the housing units of Madison county, Ill. are in the city of Alton.)
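The difference between AFACT and AFACT2 is easiest to see with a tiny made-up case. The Python sketch below is an illustration only: the place and county names and the housing unit counts are invented, and the function name is hypothetical.

```python
from collections import defaultdict

# Invented block-level records: (place, county, housing_units).
records = [
    ("Place A", "County X", 300),
    ("Place A", "County Y", 100),
    ("Place B", "County X", 700),
]

def two_way_factors(rows):
    """Return {(source, target): (weight, afact, afact2)}.

    afact  = share of the source area's weight that falls in the target.
    afact2 = share of the target area's weight that falls in the source.
    """
    pair = defaultdict(float)
    source_total = defaultdict(float)
    target_total = defaultdict(float)
    for source, target, weight in rows:
        pair[(source, target)] += weight
        source_total[source] += weight
        target_total[target] += weight
    return {
        (s, t): (w, w / source_total[s], w / target_total[t])
        for (s, t), w in pair.items()
        if w > 0  # an empty intersection has no meaningful factors
    }

factors = two_way_factors(records)
for key, (weight, afact, afact2) in sorted(factors.items()):
    print(key, weight, round(afact, 3), round(afact2, 3))
```

Here 75% of Place A is in County X (AFACT), while Place A accounts for only 30% of County X (AFACT2). The same pairing gets both numbers on one output line.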

Remember that everything is frozen in the 1990 time-frame. The data you see for O'Fallon, Mo. is based on the boundary of that city as defined for the 1990 census; it is not the current definition of that place. Likewise, of course, the housing unit (weight variable) counts are from the 1990 census.

If you are familiar with the St. Louis metro area you might expect to find the cities of Troy and Warrenton, Missouri in this report. These cities are in Lincoln and Warren counties, which were added to the official metro area (MSA) definition in 1992. But the metro codes stored in the MABLE database are as of the 1990 census, so these two counties will not be selected. You could fix this by going back and entering the FIPS codes for the two "missing" counties in the box provided for filtering by county.

Geocorr Example 3: Listing of Codes With Pop Counts

While the primary purpose of geocorr is to look at the relationships between different geographic layers, it can also be quite useful as a tool for simply looking at a single geographic layer. In this example we use the "Entire Universe" option for the Target geocodes, essentially telling the program that we have no target codes. In this case our output will show simply the geographic codes and related names for the source geography, along with the value of the weighting variable (1990 Total Population in this example.) If necessary, hit a "Reset Defaults" button before starting these specs.

  Option/parm    Specify Value          Comments
  -----------    -------------          ----------
Input Options:
 state                      Remember, you MUST select at least one.
                                        You could, of course, select more than one.

 SOURCE geocodes County (1990)          You are requesting a listing of the
                 Metro Area: ...        counties (or county equivalents) and
                                        the corresponding MSA/CMSA areas. If
                                        you are unfamiliar with the MSA/CMSA
                                        concept go to the MAGGOT file and read
                                        the explanations there.

 TARGET geocodes Entire Universe        Basically, this says you don't want any
                                        target layer(s): you just want to know
                                        about the source geographic areas in
                                        their entirety.

 Weighting Var.  Population (D)         Shows up in output report and file.

 Ignore zero ... (check it)             You should almost always choose this.

Output Options:
 Generate a CSV..
                 (uncheck it)           Program will NOT generate a comma
                                        separated value file.

 Generate a listing file...........
                 (check it) (D)         Generates a report-format output file.
                                        (You MUST check either the .CSV file
                                        option or this one - otherwise you have
                                        no output!)

 Codes and Names..
                 (select this)          From the Select list for the listing
                                        file -- IMPORTANT for this kind of
                                        request. You want to see both the codes
                                        and the names associated with those
                                        codes.

Leave all other parameters and options unspecified.
Click on the "Run Request" button to initiate processing. Wait patiently for the "Results of Query" page to come back to you.

The important output is the geocorr.lst report file. Click on it to see the report. It should be sorted by (state and) county. The column labeled "COUNTY" contains the 5-digit FIPS code and the field labeled "COUNTYNM" has the name of the county (including the state abbreviation.) These columns are followed by the MSACMSA and MSANAME fields with comparable data (code and name) for the metropolitan area. For counties or portions of counties (only in New England) falling outside any metropolitan area you'll see the code '9999' with "Non-metro" for the name. The POP column contains the 1990 complete count population for the county/metro area. For all but a few counties in New England this figure will represent the population of the entire county. The AFACT (allocation factor) column is a constant "1.000", as it always will be when "Entire Universe" is specified for the target geocode.

As an optional exercise for the more serious geocorr user you might try rerunning this request but with the following changes:

In this case what you will see is a report very much like the one you just generated in that it will be counties within metro areas on each line of the report. But the AFACT values will now almost all be less than 1.0. Do you understand why? If not, remember that AFACT is defined as the "portion of the area defined by the source geocodes contained within the area defined by the target areas". In this instance, it becomes the decimal fraction of the county population which is also included in the metro area. (In the original example, AFACT represented the portion of the county-metro combination that was contained within the Entire Universe.)

Printing the Report

The usual precautions about printing a document using a web browser apply here. You may need to tweak some of your options - such as your fixed font size - to get this report to print without truncation. In some rare cases you may need to bring the report down to a local file and use a word processor to format and print it just the way you like it. But unless you have specified a large number of geocodes, this will rarely be necessary. In most cases, simply using the PRINT button on your browser should display the report quite well.

Geocorr Example 4: Population Within an N-mile Radius

In this example, we finally look at some of the options pertaining to the use of x-y coordinates for filtering the data. Specifically, what we'll want to do is determine the total 1990 population living within a 30-mile radius of the city of Washington, Mo. Actually, we'll break that population down by county. Hit the Reset Defaults button and let's start over with the options for this sample. Any option not mentioned, just leave it with the default setting.

  Option/parm    Specify Value          Comments
  -----------    -------------          ----------
Input Options:
 State           Missouri.              If the n-mile circle went outside the
                                        state we would not pick up those pops.

 Source geocodes County                 When you choose county, state is
                                        implied (selected by the program.)

 Target geocodes Entire Universe        We're not really doing a "correlation
                                        list" in this example. We just want the
                                        sum of our weight variable - the 1990
                                        total population - for the circle we'll
                                        specify.

Output Options:
 Select "Codes and Names" for the       So you'll know what the counties are.
 listing file.

Point and Distance Options:
 Coordinates of point:
                 Latitude: 38.545881    Go down a little on the form to where
                 Longitude: 91.019346   there are a series of links provided to
                                        help you determine point coordinates.
                                        Click on the link to "Gazetteer" (at
                                        the Census Bureau). On the form
                                        presented enter the city as Washington,
                                        the state as Mo and the ZIP as 63090
                                        (optional but helps). The application
                                        will present you with the coordinates
                                        (among other useful things such as
                                        links to maps and census data.) Write
                                        the coordinates down, hit "back"
                                        several times to get back to the
                                        Geocorr form and type in the
                                        coordinates. A leading "-" on the
                                        longitude is optional. West longitude
                                        is assumed.

 Label of point  Washington, Mo         Not required but useful.

 Value for Radius or Largest Ring
                 30                     This means 30 miles. If you click the
                                        option box above it would mean 30
                                        kilometers.

Hit the Run Request button to run the job. You have told geocorr to find 1990 census blocks whose centroids are within 30 miles of a specified point which we hope is near the center of the city of Washington, MO. We have specified that we want to look at the relationship of counties to the Entire Universe for this geographic area. If you read the fine print (the Note: at the bottom of the Point and Distance Options section), you'll be told to expect some extra items in your report when you specify a point and radius. The intptlng and intptlat variables contain the weighted average of the block centroid coordinates for all census blocks that were aggregated to create the output summary line. These are of value only as a general indicator of the "center" of this geographic intersection. The distance variable is the distance (in miles or kilometers, depending on the option you selected on the form -- miles, in our example) between the specified point and (intptlat,intptlng). It thus represents sort of an "average" distance.

The POP item on the output file represents the sum of the block populations for all the census blocks used to create the geographic summary area. In this case the output line for Franklin county has a POP figure that is the total of 1990 population for all blocks that are both in Franklin county and within 30 miles of our point. To get the overall total population for the 30-mile circle we shall need to add all POP figures from our output report. Or, we could go back and rerun the application and choose STATE instead of COUNTY as our source geocode; then we would get only a single output line -- the 30-mile circle intersected with the state.
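The selection just described (keep the blocks whose centroids fall inside the circle, then sum their populations by county) can be sketched as follows. This is a Python illustration under stated assumptions: we do not know which distance formula geocorr actually uses, so the haversine great-circle formula and the Earth radius below are our own choices, and the block records are invented. Note that the code needs a signed (negative) west longitude, whereas the form lets you omit the sign.

```python
import math
from collections import defaultdict

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumption of this sketch

def great_circle_miles(lat1, lng1, lat2, lng2):
    """Haversine distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def pop_within_radius(blocks, lat, lng, radius_miles):
    """Sum population by county over blocks whose centroid is inside the circle."""
    totals = defaultdict(int)
    for b_lat, b_lng, county, pop in blocks:
        if great_circle_miles(lat, lng, b_lat, b_lng) <= radius_miles:
            totals[county] += pop
    return dict(totals)

# Invented block centroids: (latitude, longitude, county, population).
sample_blocks = [
    (38.55, -91.01, "Franklin", 120),  # a fraction of a mile from the point
    (38.80, -91.14, "Warren", 80),     # roughly 19 miles away
    (39.10, -94.58, "Jackson", 500),   # far outside the circle
]

totals = pop_within_radius(sample_blocks, 38.545881, -91.019346, 30)
print(totals, sum(totals.values()))
```

Adding up the county figures gives the total for the whole circle, which is the same sum you would do by hand from the report.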

Geocorr Example 5: Block Groups in a 1, 3 and 5-mile Radius

A very common request is to determine demographic profiles of circular areas about a given location (typically, the location is an existing or proposed site for a business, school, service center, etc.) In this example, we see how MABLE/Geocorr can be used to extract the required block group geography that will tell you what geographic areas you will have to aggregate to get your demographic profile. M/G does not (currently) link to any detailed demographic data and does not produce any profiles, but it does provide you with a key component of such an application by selecting the geographic units for the circular or ring ("donut") areas.
In this example, we'll determine the latitude,longitude coordinates of the UM St. Louis (UMSL) campus in St. Louis county, MO. We'll generate a CSV file containing all the block group codes within a series of concentric "rings" about that site.
Hit the Reset Defaults button and let's start over with the options for this sample. As usual, any option not mentioned should be left with its default setting.

  Option/parm    Specify Value          Comments
  -----------    -------------          ----------
Input Options:
 State           Missouri.              Maybe we should make this the default.

 Source geos     Block group            County and tract will also be selected
                                        for you.

 Target geos     Concentric Ring..      Geocorr will assign the "ring" code
                                        dynamically based on x-y coordinates
                                        and series of ring values specified
                                        below.

 Ignore block..  Turn on.               Almost always saves time to ignore
                                        blocks with no "weight".

Output Options:
 Generate listing.
                 Turn off.              Only want CSV file, do not want report.

 Sort by target geocodes, then..
                 Turn on.               So the output is sorted by the ring
                                        numbers first, then by block group.

 Use tabs on CSV Turn on.               Output file will have tabs between
                                        fields instead of commas.

Point and Distance Options:
 Coordinates:    latitude 38.70763      Leading "-" is OK, but not required. So
                 longitude -90.31118    how did we get this? Go down to the
                                        Yahoo Map Server link, just below.
                                        Click on it & wait for page to appear.
                                        Type in 8001 Natural Bridge Rd in first
                                        box. Then type St. Louis, Mo 63121 in
                                        City, State box. It may take a minute
                                        for map to appear. Put cursor over
                                        "Print Preview" and read coordinates
                                        from the URL that appears in the box at
                                        the bottom of the browser window. Do
                                        NOT click the button. This is a secret
                                        trick. It also works for street
                                        intersections ("6th & Locust") or just
                                        city names ("Cool Valley, MO" with
                                        address box left blank.) ZIP code is
                                        optional.

 Label of point  UMSL.                  University of Missouri St. Louis.

 Custom list of ring radii
                 #1: 1  #2: 3  #3: 5    Fill in the ring radii in ascending
                                        order. Note that we do NOT enter the
                                        radius value or the "# of equi-distant
                                        rings".

Hit the Run Request button. Wait... Browse the summary.log and geocorr.csv files to see the results. Note how the fields "line up" when browsing the CSV file with tabs as delimiters. Also browse the very small varlist.lst file to see what it contains. Basically, just labels for the variables on the CSV file. Print it if you need it for documentation. Notice how the point label "UMSL" appears in the label for the distance variable.
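To see what assigning a "ring" code from a distance might look like, here is a small Python sketch. It is an illustration only: how geocorr numbers its rings, and whether a block sitting exactly on a ring boundary counts as inside, are assumptions we make here (rings numbered from 1, boundary inclusive).

```python
import bisect

def ring_code(distance, radii):
    """Return the 1-based number of the innermost ring containing `distance`.

    `radii` must be in ascending order, e.g. [1, 3, 5]. Returns None when
    the distance is beyond the largest ring.
    """
    index = bisect.bisect_left(radii, distance)  # first radius >= distance
    return index + 1 if index < len(radii) else None

radii = [1, 3, 5]
for miles in (0.5, 1.0, 2.2, 5.0, 5.1):
    print(miles, ring_code(miles, radii))
```

With the radii 1, 3 and 5, a block 2.2 miles out falls in ring 2 (the "donut" between 1 and 3 miles), and anything past 5 miles is not selected at all.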

If you were going to do this a lot or you were in a big hurry you could make this example run somewhat faster by telling the program which county or counties your circles fall within. In this case, we could have gone down near the bottom of the form to the Geographic Filtering Options section. There, in the text box for the County codes we could have entered
189 510
to specify that we wanted to restrict our query to the 2 counties with these FIPS codes. These are the codes for St. Louis county and St. Louis city (which we could have "looked up" using the link there had we not already known these.) Do not do this, of course, unless you are sure that your circle will not go beyond the counties entered (or unless you don't care and want to limit the search to these counties anyway.)

A typical use of such an output file would be to save it from your browser to a file and then to bring it into another program where you would use it to select all the block groups it contains from a data extract file (from STF3, for example). Then you could sum the numbers for those block groups (multiplied by the values of the AFACT variable to "allocate" the data when a BG is in more than one ring.) We hope some day to enhance this application to allow this kind of post-processing to be integrated into a system which uses the geocorr application.
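The post-processing step described above can be sketched in Python as follows. Everything here is hypothetical: the block group identifiers, the ring rows, and the household counts stand in for a saved geocorr CSV file and a data extract of your own.

```python
from collections import defaultdict

# Invented geocorr output rows: (block_group, ring, afact).
# BG2 straddles rings 1 and 2, so it appears twice.
ring_rows = [
    ("BG1", 1, 1.0),
    ("BG2", 1, 0.4),
    ("BG2", 2, 0.6),
    ("BG3", 2, 1.0),
]

# Invented extract: households per block group. BG4 is outside every ring.
households = {"BG1": 500, "BG2": 1000, "BG3": 200, "BG4": 900}

def allocate_to_rings(rows, data):
    """Sum a block group measure into rings, scaled by each row's AFACT."""
    totals = defaultdict(float)
    for block_group, ring, afact in rows:
        totals[ring] += data[block_group] * afact
    return dict(totals)

ring_totals = allocate_to_rings(ring_rows, households)
print(ring_totals)
```

Block groups that appear in the extract but not in the geocorr file (BG4 here) are simply never visited, which is the "select" part of the job. The allocation assumes the measure is spread within a block group the same way the weighting variable is, which is only an approximation.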

Last Update: 06-10-97