Analyzing Voter Turnout Data with R

Part 4: Areal Interpolation


Data for Democracy, Fall 2024
Andy Lyons

https://ajlyons.github.io/dfd2024/






Areal Interpolation

Areal interpolation is the process of using values from a source set of polygons to make estimates for an overlapping but incongruent set of target polygons.

Also called spatial resampling.

The interpolated values are typically counts or percentages.

Resampling requires an assumption about how values are distributed within the source polygons (e.g., uniformly).
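
To make the uniform-distribution assumption concrete, here is a minimal sketch in base R. All of the numbers (voter counts, areas, overlaps) are made up for illustration and are not from the workshop data. If voters are spread evenly within each source polygon, the share of voters that falls in a target polygon equals the share of the source polygon's area that falls in it.

```r
# Hypothetical source polygons: total voters and total area (sq km)
source_voters <- c(A = 1000, B = 600)
source_area   <- c(A = 10,   B = 4)

# Hypothetical area of each source polygon that lies inside one target polygon
overlap_area  <- c(A = 2.5,  B = 3)

# Under the uniform assumption, share of voters = share of area
area_share <- overlap_area / source_area

# Estimate for the target polygon = sum of the apportioned pieces
target_voters <- sum(source_voters * area_share)
target_voters
```

The weaker the uniform assumption (e.g., voters clustered in one corner of a polygon), the less reliable this kind of estimate becomes.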


areal

areal (Chris Prener) is an R package tailored to perform areal interpolation.


Highlights

For population-weighted areal interpolation (a binary dasymetric method), see tidycensus::interpolate_pw().

Input Data

Source - sf polygon object, preferably projected, with column(s) that you want estimated for the target polygons.

Target - sf polygon object, same CRS as the source.

Source and target polygons should overlap.


The source and target layers should each have a column with unique values (i.e., a primary key).

Don’t include extraneous columns in the source that you don’t need in the target.

Use ar_validate() to check that your source and target objects are ready to go.
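
The checks above might look something like the sketch below. It uses the St. Louis sample layers that ship with areal (ar_stl_race and ar_stl_wards) as stand-ins, not the workshop data, and it assumes GEOID is the source key and TOTAL_E is the column to interpolate.

```r
library(sf)
library(areal)

# Sample layers shipped with areal (stand-ins for your own data)
src <- ar_stl_race    # census tracts with population counts
tgt <- ar_stl_wards   # city wards

# Put the target in the same CRS as the source
tgt <- st_transform(tgt, st_crs(src))

# Keep only the key and the column to interpolate (geometry stays attached)
src <- src[, c("GEOID", "TOTAL_E")]

# Returns a single logical; verbose = TRUE returns the individual checks
ready <- ar_validate(source = src, target = tgt, varList = "TOTAL_E")
ready
```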


aw_interpolate()

The main function in areal is:

aw_interpolate(.data, tid, source, sid, extensive, intensive, weight, output)


If the source and target polygons overlap completely (excluding small insignificant differences), use weight = 'sum'.

If the source and target polygons don’t cover the same area, use weight = 'total'.
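
As a rough illustration of the two weight options, the sketch below calls aw_interpolate() on the sample layers that ship with areal (again stand-ins, not the workshop data). The same layers are used in both calls purely to compare the arguments; in practice you would pick one option based on how your polygons overlap.

```r
library(areal)

# weight = "sum": all of each tract's population is divided among the wards
wards_sum <- aw_interpolate(ar_stl_wards, tid = WARD,
                            source = ar_stl_race, sid = GEOID,
                            weight = "sum", output = "tibble",
                            extensive = "TOTAL_E")

# weight = "total": each tract contributes in proportion to its total area,
# so population in any uncovered part of a tract is not allocated
wards_total <- aw_interpolate(ar_stl_wards, tid = WARD,
                              source = ar_stl_race, sid = GEOID,
                              weight = "total", output = "tibble",
                              extensive = "TOTAL_E")

# Compare the estimated totals from the two approaches
c(sum = sum(wards_sum$TOTAL_E), total = sum(wards_total$TOTAL_E))
```

TOTAL_E is a count, so it is passed as an extensive variable; percentages, rates, and other ratios would go in the intensive argument instead.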

Words of Wisdom

The path of areal interpolation, more work and assumptions you carry. Follow only if you must.

Make sure you really have to do your own areal interpolation.

The Census Bureau summarizes a lot of census data by a lot of geographies.

Other data publishers commonly summarize their data by census enumeration areas and administrative boundaries.


Exercise 4: Areal Interpolation

In this exercise, we will:

  1. import the 2020 voting tabulation districts (VTDs) for Camden County NJ from a Shapefile
  2. import a CSV with voter turnout data from the 2020 primary election
  3. join the tabular voter turnout data to the VTD polygons
  4. save the data to disk (for use in other exercises)
  5. map the voter turnout for the July 2020 primary
  6. reshape the attribute table from a wide to long format
  7. create facet maps (i.e., one map for each subset of the data)


https://posit.cloud/content/8521414