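I'm trying to plot US county-level data, but I can't figure out why some counties don't appear. In this toy example I focus only on California counties, and I keep all of the daily data until it is filtered at the very end, in the call to ggplot() (my real use case involves gganimate, so I need the daily data).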
library(tidyverse)
library(sf)
library(viridis)
library("rio")
# get county geometry
url <- "https://gist.githubusercontent.com/ericpgreen/717596c37478ef894c14b250477fae92/raw/c2cf4b273a2c7f0677f22a37b5e9f7e893204e3b/cali.R"
cali <- rio::import(url)
# get covid data
covid <- read.csv("https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties.csv",
                  stringsAsFactors = FALSE)
# prep covid data
covidPrepped <-
  covid %>%
  filter(state == "California") %>%
  select(date, fips, cases, deaths) %>%
  mutate(date = lubridate::ymd(date)) %>%
  mutate(fips = stringr::str_pad(fips, width = 5, pad = "0")) %>%
  mutate(month = lubridate::month(date,
                                  label = TRUE,
                                  abbr = TRUE),
         day = lubridate::day(date),
         monthDay = paste(month, day, sep = " "))
# make sure every county has a row for every day
complete <-
  cali %>%
  left_join(covidPrepped, by = c("GEOID" = "fips")) %>%
  complete(date, GEOID, fill = list(cases = 0)) %>%
  select(date, GEOID, cases, monthDay)
# join back to geometry and construct casesPop
pData <-
  complete %>%
  left_join(select(cali, GEOID, NAME, estimate, geometry),
            by = "GEOID") %>%
  st_as_sf() %>%
  mutate(casesPop = (cases / estimate) * 100000) %>%
  mutate(casesPop = ifelse(is.na(casesPop), 0, casesPop)) %>%
  mutate(group = cut(casesPop,
                     breaks = c(0, 1, 3, 10, 30, 100,
                                300, 1000, 3000, 10000,
                                Inf),
                     labels = c(0, 1, 3, 10, 30, 100,
                                300, 1000, 3000, 10000),
                     include.lowest = TRUE)
  ) %>%
  select(GEOID, geometry, group, monthDay)
# plot
ggplot(pData %>% filter(monthDay == "May 5")) +
  geom_sf(aes(fill = group), color = "white", size = .1) +
  scale_fill_viridis_d(option = "magma", drop = FALSE) +
  coord_sf(crs = 102003) +
  theme_minimal() +
  theme(legend.position = "top",
        legend.box = "horizontal",
        legend.title = element_blank(),
        legend.justification = "left") +
  guides(fill = guide_legend(nrow = 1))
Missing counties:
# counties in the geometry that have no row left after filtering to May 5
missing <- pData %>% filter(monthDay == "May 5")
cali$GEOID[!(cali$GEOID %in% missing$GEOID)]
#[1] "06035" "06049" "06091" "06105"
These counties have no covid data for May 5, but I thought that would be taken care of by the call to complete():
complete(date, GEOID, fill = list(cases = 0))
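To make the expectation concrete, here is a minimal sketch of how complete() treats the rows it adds. The `toy` tibble and its values are made up for illustration and are not taken from the NYT data. It suggests that only the columns named in `fill` get a replacement value, while every other column (monthDay here) is left as NA in the added rows, which may matter for the filter on monthDay above.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: county "B" has no row for 2020-05-05
toy <- tibble::tibble(
  date     = as.Date(c("2020-05-04", "2020-05-05", "2020-05-04")),
  GEOID    = c("A", "A", "B"),
  cases    = c(1, 2, 3),
  monthDay = c("May 4", "May 5", "May 4")
)

# Same call as in the question: add the missing date/GEOID combinations
filled <- toy %>%
  complete(date, GEOID, fill = list(cases = 0))

# The added row for county "B" on 2020-05-05 has cases = 0 but monthDay = NA
print(filled)

# filter() drops rows where the condition evaluates to NA,
# so filtering on monthDay loses the added row ...
n_by_label <- nrow(filter(filled, monthDay == "May 5"))

# ... while filtering on the completed date column keeps it
n_by_date <- nrow(filter(filled, date == as.Date("2020-05-05")))

print(c(by_monthDay = n_by_label, by_date = n_by_date))
```

If this is what happens in the real data, one option might be to build monthDay after the call to complete(), or to filter on date instead of monthDay. I have not verified that this accounts for all four missing counties.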