As we’ve noted in our basic overview of ZIP Codes, they are not identical to the U.S. Census Bureau’s ZIP Code Tabulation Areas (ZCTAs) or other geographies. We therefore use crosswalk files to convert ZIP Codes to these other identifiers.
`zippeR` provides an interface for accessing the former [UDS Mapper project’s ZIP to ZCTA crosswalk files](http://web.archive.org/web/20231218141557/https://udsmapper.org/zip-code-to-zcta-crosswalk/). Crosswalk files are critical because not every ZIP Code falls in a ZCTA with the same five-digit code. The UDS files are available from 2009 through 2023 in a standardized format:
> zi_load_crosswalk(year = 2020)
# A tibble: 41,096 × 6
ZIP PO_NAME STATE ZIP_TYPE ZCTA zip_join_type
<chr> <chr> <chr> <chr> <chr> <chr>
1 00501 Holtsville NY Post Office or large volume customer 11742 Spatial join to ZCTA
2 00544 Holtsville NY Post Office or large volume customer 11742 Spatial join to ZCTA
3 00601 Adjuntas PR ZIP Code Area 00601 ZIP Matches ZCTA
4 00602 Aguada PR ZIP Code Area 00602 ZIP Matches ZCTA
5 00603 Aguadilla PR ZIP Code Area 00603 ZIP Matches ZCTA
6 00604 Aguadilla PR Post Office or large volume customer 00603 Spatial join to ZCTA
7 00605 Aguadilla PR Post Office or large volume customer 00603 Spatial join to ZCTA
8 00606 Maricao PR ZIP Code Area 00606 ZIP Matches ZCTA
9 00610 Anasco PR ZIP Code Area 00610 ZIP Matches ZCTA
10 00611 Angeles PR Post Office or large volume customer 00641 Spatial join to ZCTA
# … with 41,086 more rows
As with the three-digit ZCTA geometry, users should evaluate these data carefully before using them to ensure they are fit for purpose. In particular, they should note that ZIPs that do not have corresponding ZCTAs (such as Armed Forces mailing ZIPs and those in some overseas territories) are not included. Users should also remember that individuals may live in a different ZCTA from their mailing address when that address is a Post Office or some other large volume customer.
The UDS crosswalk files can be used with `zi_crosswalk()` to convert ZIP Codes to ZCTAs:
> zips <- data.frame(id = c(1:3), ZIP = c("63139", "63108", "00501"))
> zi_crosswalk(zips, input_zip = ZIP, dict = "UDS 2021")
# A tibble: 3 × 3
id ZIP ZCTA
<int> <chr> <chr>
1 1 63139 63139
2 2 63108 63108
3 3 00501 11742
If `"UDS 2021"` (or any other year between 2009 and 2023) is given for `dict`, `zi_crosswalk()` will automatically download the corresponding UDS crosswalk file. A custom crosswalk can also be supplied for `dict` in lieu of the UDS data, including a crosswalk created from `zi_load_crosswalk()` using HUD data. In that case, `dict_zip` and `dict_zcta` should be updated to match the variable names in the custom dictionary. The `style` argument can also be used if the custom dictionary contains three-digit ZCTAs instead. If no custom dictionary is supplied, `zi_crosswalk()` will try to convert the dictionary’s five-digit ZCTAs to three digits.
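To make the dictionary logic concrete, the following base R sketch mimics a lookup against a small custom dictionary and then shortens the result to three digits. It does not call `zi_crosswalk()`. The object names (`custom_dict`, `zcta3`) are illustrative, and treating a three-digit ZCTA as the first three characters of the five-digit code is an assumption about the conversion, not a description of the package's internals.

```r
# illustrative custom dictionary; values taken from the crosswalk output above
custom_dict <- data.frame(
  ZIP = c("63139", "63108", "00501"),
  ZCTA = c("63139", "63108", "11742"),
  stringsAsFactors = FALSE
)

zips <- data.frame(id = 1:3, ZIP = c("63139", "63108", "00501"),
                   stringsAsFactors = FALSE)

# look up each ZIP in the dictionary; unmatched ZIPs would become NA
zips$ZCTA <- custom_dict$ZCTA[match(zips$ZIP, custom_dict$ZIP)]

# assumption: a three-digit ZCTA is the first three characters of the ZCTA
zips$zcta3 <- substr(zips$ZCTA, 1, 3)
```

Storing ZIP Codes and ZCTAs as character vectors, as the package output does, preserves leading zeros such as the one in `00501`.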
The U.S. Department of Housing and Urban Development (HUD) provides ZIP Code to Census geography crosswalks that can be used to convert ZIP Codes to Census Tracts, counties, and other geographies. These data are available through the HUD User website. Unlike the UDS files, the HUD crosswalks do not include ZIP Code Tabulation Areas among their geographies. If HUD data are used, be aware that a single ZIP Code can map into multiple Census Tracts, counties, etc. Many users may want to pick a “most likely” county (or other Census geography) based on the proportion of commercial or residential customers.
To use the HUD data, users must first obtain an API key from the HUD User website. Once they have a key, they can use `zi_load_crosswalk()` to download the data, either by passing the key directly to the function or by storing the key in their `.Rprofile` under the object name `hud_key`.
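A minimal sketch of the `.Rprofile` approach is shown below. The key value is a placeholder, not a real credential, and the `exists()` check is only an illustrative way to confirm that the object is available in a session.

```r
# in ~/.Rprofile: store the HUD API key under the object name hud_key
hud_key <- "your-hud-api-key"  # placeholder; substitute your own key

# in a later session: confirm the object is available before requesting data
exists("hud_key")
```

Keeping the key in `.Rprofile` keeps it out of analysis scripts that may be shared or committed to version control.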
The key can also be passed to `zi_load_crosswalk()` directly with the `key` argument. The example below does not set `key`, so it assumes the key has already been stored:
> zi_load_crosswalk(zip_source = "HUD", year = 2023, qtr = 1, target = "COUNTY",
+ query = c("63138", "63139"))
# A tibble: 3 × 8
ZIP GEOID RES_RATIO BUS_RATIO OTH_RATIO TOT_RATIO CITY STATE
<chr> <chr> <dbl> <dbl> <int> <dbl> <chr> <chr>
1 63138 29189 0.999 0.988 1 0.999 SAINT LOUIS MO
2 63138 29510 0.000518 0.0124 0 0.000956 SAINT LOUIS MO
3 63139 29510 1 1 1 1 SAINT LOUIS MO
Queries can be a single ZIP Code, a vector of ZIP Codes, a state abbreviation, or the word `"ALL"` to download the entire crosswalk file. Querying by state or with `"ALL"` is available from the 1st quarter of 2021 onwards. The `target` argument can be set to `"COUNTY"`, `"TRACT"`, `"CBSA"`, `"CBSADIV"`, `"CD"`, or `"COUNTYSUB"`. The `year` and `qtr` arguments specify the year and quarter of the data to download.
Note that the above query finds that the ZIP Code `63138` straddles two counties, but the vast majority of both residential and commercial customers are in St. Louis County (`GEOID` is `29189`). If you were building a crosswalk file from these data, you might want to select St. Louis County as the “most likely” county for ZIP Code `63138`.
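As a rough illustration of that choice, the base R sketch below keeps, for each ZIP Code, the county with the largest residential ratio. The data frame reproduces the `ZIP`, `GEOID`, and `RES_RATIO` values printed above. The object names `hud` and `most_likely` are illustrative, and this is a hand-rolled example, not the package's own implementation.

```r
# HUD output for the two ZIP Codes above, with ratios as printed
hud <- data.frame(
  ZIP = c("63138", "63138", "63139"),
  GEOID = c("29189", "29510", "29510"),
  RES_RATIO = c(0.999, 0.000518, 1),
  stringsAsFactors = FALSE
)

# order so that the largest residential ratio comes first within each ZIP
hud <- hud[order(hud$ZIP, -hud$RES_RATIO), ]

# keep the first (largest-ratio) row per ZIP
most_likely <- hud[!duplicated(hud$ZIP), c("ZIP", "GEOID")]
```

Here `most_likely` pairs `63138` with `29189` and `63139` with `29510`. Swapping `RES_RATIO` for `BUS_RATIO` or `TOT_RATIO` from the full output would base the choice on commercial or total addresses instead.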
Since using the HUD data requires a number of analytic choices, they cannot be accessed directly through `zi_crosswalk()`. Instead, you should construct the desired crosswalk file yourself and then pass it to `zi_crosswalk()` as a custom dictionary. The `zi_prep_hud()` function can help you prepare the HUD data for use in joins:
# access to HUD ZIP Code to County crosswalk for all ZIP Codes in Missouri
mo <- zi_load_crosswalk(zip_source = "HUD", year = 2023, qtr = 1,
target = "COUNTY", query = "MO")
# prep data
mo <- zi_prep_hud(mo, by = "residential")
The resulting output contains one row of data for each ZIP Code, matched with the county that contains the highest proportion of that ZIP Code’s residential addresses. Users can also construct a crosswalk using commercial addresses or total addresses. When the data cover multiple states and a ZIP Code straddles two of them, two records will be returned for that ZIP Code.