The data here recreate the 2016 analysis from Vox, modeled off of the Washington State Department of Health’s analysis, to derive lead exposure risk scores. Scores are derived from tract-level Census Bureau data for “poverty status in the last 12 months” (Table S1701) and “year structure built” (Table B25034). The original analysis used 2014 ACS data, but my analysis here uses the most recent ACS data, from 2019.
The key estimates in this dataset are a lead risk index and percentile score ranging from 1-10, with 1 indicating very little lead exposure risk (tract is in the bottom 10 percent for risk based on nation-wide ranking) and 10 indicating very high lead exposure risk (tract is in the top 10 percent for risk based on nation-wide ranking). These lead risk scores and ranks are based on the age of housing and poverty status, which have been shown to influence the risk of lead poisoning in children.[^1]
The lead exposure risk is estimated, first, as a function of the age of the housing stock times a nationally-derived estimate of the proportion of housing from each era with lead risks (see Jacobs et al (2002)):
The resulting values are summed within census tracts and the sum is divided by the total number of households to generate housing risk.
Lead exposure risk is also higher in environments characterized by poverty. Poverty-induced risk is calculated by the the number of people under 125% of the poverty line divided by the total number of people within a tract.
The housing-induced risk and poverty-induced risk values are standardized to generate comparable scales and combined with additive weights to create the lead exposure risk score
\[lead\; exposure\; risk = housing\; risk \times 0.58 + poverty\; risk \times 0.42 \] The lead exposure risk score was ranked across all tracts in the US and each tract was assigned to a decline between 1 and 10. The data provided here is only for the Charlottesville area so that the risk rankings are relative to the nation as a whole.
glimpse(leadrisk)
## Rows: 50
## Columns: 8
## $ GEOID <chr> "51065020200", "51003010201", "51003010202", "510030…
## $ NAME <chr> "Census Tract 202, Fluvanna County, Virginia", "Cens…
## $ countyfips <chr> "065", "003", "003", "003", "003", "003", "003", "10…
## $ tractfips <chr> "020200", "010201", "010202", "010401", "010402", "0…
## $ leadriskscore_raw <dbl> -0.14187013, -1.07303392, -0.79073840, -0.37250423, …
## $ lead_risk_rank <dbl> 5, 1, 2, 4, 1, 3, 1, 2, 2, 3, 5, 5, 8, 10, 6, 5, 2, …
## $ housing_risk <dbl> 24.440575, 6.057143, 11.327447, 18.447356, 5.575397,…
## $ poverty_risk <dbl> 9.138218, 2.379958, 4.817329, 9.311255, 3.433333, 19…
Observations are census tract estimates of:
leadrisk %>% select(-c(GEOID:NAME)) %>%
select(where(~is.numeric(.x) && !is.na(.x))) %>%
as.data.frame() %>%
stargazer(., type = "text", title = "Summary Statistics", digits = 0,
summary.stat = c("mean", "sd", "min", "median", "max"))
##
## Summary Statistics
## ==============================================
## Statistic Mean St. Dev. Min Median Max
## ----------------------------------------------
## leadriskscore_raw -0 1 -1 -0 2
## lead_risk_rank 4 3 1 4 10
## housing_risk 13 8 2 11 33
## poverty_risk 18 17 2 13 95
## ----------------------------------------------
leadrisk %>% select(-c(NAME, countyfips, tractfips)) %>%
pivot_longer(-GEOID, names_to = "measure", values_to = "value") %>%
mutate(measure = factor(measure, levels = c("lead_risk_rank", "leadriskscore_raw", "housing_risk", "poverty_risk"))) %>%
ggplot(aes(x = value, fill = measure)) +
geom_histogram() +
facet_wrap(~measure, scales = "free") +
xlab("Risk Scores") +
guides(fill = "none")
pal <- colorFactor("viridis", reverse = TRUE, domain = leadrisk$lead_risk_rank)
leaflet(leadrisk) %>%
addProviderTiles("CartoDB.Positron") %>%
addPolygons(data = leadrisk,
fillColor = ~pal(lead_risk_rank),
weight = 1,
opacity = 1,
color = "white",
fillOpacity = 0.6,
highlight = highlightOptions(weight = 2, fillOpacity = 0.8, bringToFront = T),
popup = paste0(leadrisk$NAME.y, "<br>",
"Lead Risk Rank: ", leadrisk$lead_risk_rank)) %>%
addLegend("bottomright", pal = pal, values = leadrisk$lead_risk_rank,
title = "Lead Risk Rank", opacity = 0.7)
pal <- colorNumeric("viridis", reverse = TRUE, domain = leadrisk$leadriskscore_raw)
leaflet(leadrisk) %>%
addProviderTiles("CartoDB.Positron") %>%
addPolygons(data = leadrisk,
fillColor = ~pal(leadriskscore_raw),
weight = 1,
opacity = 1,
color = "white",
fillOpacity = 0.6,
highlight = highlightOptions(weight = 2, fillOpacity = 0.8, bringToFront = T),
popup = paste0(leadrisk$NAME.y, "<br>",
"Raw Lead Risk Score: ", round(leadrisk$leadriskscore_raw, 2))) %>%
addLegend("bottomright", pal = pal, values = leadrisk$leadriskscore_raw,
title = "Raw Lead Risk Score", opacity = 0.7)
pal <- colorNumeric("viridis", reverse = TRUE, domain = leadrisk$housing_risk)
leaflet(leadrisk) %>%
addProviderTiles("CartoDB.Positron") %>%
addPolygons(data = leadrisk,
fillColor = ~pal(housing_risk),
weight = 1,
opacity = 1,
color = "white",
fillOpacity = 0.6,
highlight = highlightOptions(weight = 2, fillOpacity = 0.8, bringToFront = T),
popup = paste0(leadrisk$NAME.y, "<br>",
"Housing Risk: ", round(leadrisk$housing_risk, 2))) %>%
addLegend("bottomright", pal = pal, values = leadrisk$housing_risk,
title = "Housing Risk", opacity = 0.7)
pal <- colorNumeric("viridis", reverse = TRUE, domain = leadrisk$poverty_risk)
leaflet(leadrisk) %>%
addProviderTiles("CartoDB.Positron") %>%
addPolygons(data = leadrisk,
fillColor = ~pal(poverty_risk),
weight = 1,
opacity = 1,
color = "white",
fillOpacity = 0.6,
highlight = highlightOptions(weight = 2, fillOpacity = 0.8, bringToFront = T),
popup = paste0(leadrisk$NAME.y, "<br>",
"Poverty Risk: ", round(leadrisk$poverty_risk, 2))) %>%
addLegend("bottomright", pal = pal, values = leadrisk$poverty_risk,
title = "Poverty Risk", opacity = 0.7)
{^1]: There are other factors that influence lead exposure, but it is often hard to get good data on those factors.