Missouri Census Data Center

Dexter Xsample: Detailed Query Description

Query ID: PoorestCounties

Summary

Accesses a standard extract data set for the 2000 census, Summary File 3, with state and county-level data for the whole country (including Puerto Rico). It selects all counties that had a poverty rate of at least 25%, but excludes the county equivalents in PR. We chose the 25% threshold after looking at a prelinary extract and knowing that we wanted about 200 counties (we got 203). We keep 3 poverty count variables and the corrsponding percentages. Output is a plain text report file with custom title, subtitle and footnote.

Filetype accessed: sf32000x

Data Set accessed: usstcnty - chose because we wanted data for the entire U.S. with county level summaries.

Dexter features used:

  • A 3-condition filter in Section II
  • Section IV is featured, with all 4 options filled in, including the sort
  • we use the "easy" option in Section 5 to get better column headers in the report.

Can be easily modified to: Raise or lower the poverty threshold for inclusion in the report. Keep more variables and output to a data set for subsequent analysis to see what other demographic indicators tend to go along with high rates of poverty (e.g. variables such as PctUrban, PctBlack, PctOver65, etc.)

Degree of Difficulty: 3 - moderate.

Saved Query File: View It      Run It


Annotated Dexter Query Form (As filled out to define this query) Section I: All we do in Section I is check that we wanted no Delimited File and that we wanted a Plain text report to be generated instead. Note that we could have (but did not) check the box at the bottom saying we wanted the report to be piped directly to our browser. That would eliminate the generation of the blue Output Menu screen with the link that takes us to the report file.

Screen image of dexter query form, part 1

Section II: We have 3 conditions for our data filter.

  1. We only want county level summaries (the data set we are accessing also contain state level summaries which we want to ignore). The value of the SumLev variable indicating a county level summary is 050. Note the leading "0", which is required. If you did not know what the codes were for this variable you could follow the link to Detailed Metadata at the top of the form and then follow the link to the SumLev values page from the section on Key Variables.
  2. When we first ran this query we found that most of the poorest counties we were getting were county equivalents in Puerto Rico. We decided to exclude Puerto Rico from this extraction. We could have used a filter of State Not Equal To 72 here but decided to use the Stab (state postal abbreviation) variable instead. Note that case does not matter for the Value -- dexter automatically converts character Values entered to uppercase and compares them with an uppercased value of the variable (you can override this, but you rarely need or want to).
  3. The final condition specifies that we only want to look at counties with a high poverty rate. The value 25 was actually arrived at by a bit of trial and error. We ran the query with a lower threshold (20) and saw from our sorted report that if we raised the threshold to 25 we would be capturing just over 200 counties and this is what we wanted. Note that the percentage figure is stored as 25 and not .25 .

Screen image of dexter query form, part 2



Section III: We only choose a single identifier, County. This variable is stored as a FIPS county code but displays as the county name (using a SAS format code, $county.) If we wanted to see the FIPS code for the county in the report we could select GeoCode from the Identifiers list. On the Numerics side we know we want the number of poor persons and the poverty rate (percentage of persons who are poor). But while we are focusing on those 2 items we see that there are other variables on the data set that are measure of poverty as well and would therefore be of some interest for our report. The variables on the data set are grouped together by subject, so once we scrolled down the list and found the Poor and PctPoor items, the related items were nearby and easily selected. We chose the VeryPoor and NearlyPoor variables along with the corresponding percentages. Note (from the position of the scrollbar on the Numeric select window) that we had to scroll down almost 3/4 of the way through the list of around 400 items. This can be a little tedious if you are not familiar with the data set and how the variables are ordered. There is a tool that will assist you in finding variables from long lists (100 or more variables) labeled "Filter by Regular Expression" which appears just below the select list. This xsample does not feature that capability but if you would like to learn more about how it works click on the the Section III header to access the Online help for this section and then look for the Filter by regular expression Processing topic.

Screen image of dexter query form, part 3



Section IV: As usual with this section the entries are pretty straightforward. The idea is to help your audience understand what they are looking at in the report by providing informative titles and footnotes. The Sort entry specifies that we want to have the report sorted by descending value of the variable pctpoor (the leading minus sign is used to indicate the descending sort).

Screen image of dexter query form, part 4



Section V: In the b. Advanced Report Formatting Options subsection we check the box that says to use variable labels (instead of variable names) as column headers in the report. This is a very common thing to want to do. Nothing really very advanced about it, but you have to use this section to specify it.

Screen image of dexter query form, part 5



Output Report: Not too many real famous or fmiliar counties (to most people, at least). Lots of areas with Indian reservations and rural southern areas. Starr, Texas (the largest of the counties shown in the top (or bottom, depending on POV) is in extreme southern Texas on the Mexican border.

Screen image of plain text output report



Exercises

  1. Modify the filter in Section II so that it selects only counties that are at least 75% urban.

  2. Create a pdf report that shows all counties in the United States (excluding PR) that have a poverty rate of 20 or more. Sort the report by State and then by descending PctPoor within state. Keep the following numeric variables:
    • TotPop - Total Population
    • PctUrban - Pct Urban Population
    • PctPoor
    • PctBlack1 - Pct Black alone
    • PctIndian1 - Pct American Indian, Alaska Native alone
    • PctHispPop - Pct Hispanic
    Keep State and AreaName as your Identifier variables. Type all these names into the text box in Sec. III rather than selecting them off the two select lists.

  3. Modify the previous report using the following Section V specs:
    • Specify that you want State used as a By variable in the report, with Areaname used as an ID variable for the report.
    • Use variable labels as column headers for the report.
    • To get State to display as the name of the state rather than the FIPS code enter the following spec in the Format text box of section V.c: state $state. .
    • Change the format of the report from pdf to html and have the results piped directly back to your browser with no Output Menu screen.


This file last modified Monday August 07, 2017, 13:14:34
The Missouri Census Data Center is a sponsored program of the Missouri State Library within the office of the Missouri Secretary of State. The MCDC has been a partner in the U.S. Census Bureau's State Data Center program since 1979.

Questions/Comments regarding this page or this web site can be sent to