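I have a shapefile of school districts in Texas and am trying to use ggplot2 to highlight 10 of them. I've tinkered with it and gotten everything set up, but when I spot-checked the plot I realized that the 10 highlighted districts are not in fact the ones I want highlighted.
The shapefile can be downloaded from this link to the Texas Education Agency Public Open Data Site.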
#install.packages(c("ggplot2", "rgdal"))
library(ggplot2)
library(rgdal)
#rm(list=ls())
#setwd("path")
# read shapefile
tex <- readOGR(dsn = paste0(getwd(), "/Current_Districts/Current_Districts.shp"))
# colors to use and districts to highlight
cols<- c("#CCCCCC", "#003082")
districts <- c("Aldine", "Laredo", "Spring Branch", "United", "Donna", "Brownsville", "Houston", "Bryan", "Galena Park", "San Felipe-Del Rio Cons")
# extract from shapefile data just the name and ID, then subset to only the districts of interest
dist_info <- data.frame(cbind(as.character(tex@data$NAME2), as.character(tex@data$FID)), stringsAsFactors=FALSE)
names(dist_info) <- c("name", "id")
dist_info <- dist_info[dist_info$name %in% districts, ]
# turn shapefile into df
tex_df <- fortify(tex)
# create dummy fill var for if the district is one to be highlighted
tex_df$yes <- as.factor(ifelse(tex_df$id %in% dist_info$id, 1, 0))
# plot the graph
ggplot(data=tex_df) +
  geom_polygon(aes(x=long, y=lat, group=group, fill=yes), color="#CCCCCC") +
  scale_fill_manual(values=cols) +
  theme_void() +
  theme(legend.position = "none")
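As you can see, when the plot gets created it looks like it has done exactly what I want. The problem is that the ten highlighted districts are not the ones in the districts vector above. I've re-run everything from a clean session numerous times, double-checked that I'm not hitting a factor/character conversion issue, and confirmed in the web data explorer that the IDs I get from the shapefile are indeed the ones that should match my list of names. I really have no idea where this issue could be coming from.
This is my first time working with shapefiles and rgdal, so if I had to guess, there's something simple about the structure that I don't understand, and hopefully one of you can quickly point it out for me. Thanks!
Here's the output: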
Answer 0 (score: 1)
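Alternative 1
Call the fortify function with the region argument set to "NAME2"; the id column will then contain your district names. Then create the dummy fill variable based on that column.
I'm not familiar with Texas districts, but I assume the result is right.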
tex <- readOGR(dsn = paste0(getwd(), "/Current_Districts/Current_Districts.shp"))
# colors to use and districts to highlight
cols<- c("#CCCCCC", "#003082")
districts <- c("Aldine", "Laredo", "Spring Branch", "United", "Donna", "Brownsville", "Houston", "Bryan", "Galena Park", "San Felipe-Del Rio Cons")
# turn shapefile into df
tex_df <- fortify(tex, region = "NAME2")
# create dummy fill var for if the district is one to be highlighted
tex_df$yes <- as.factor(ifelse(tex_df$id %in% districts, 1, 0))
# plot the graph
ggplot(data=tex_df) +
  geom_polygon(aes(x=long, y=lat, group=group, fill=yes), color="#CCCCCC") +
  scale_fill_manual(values=cols) +
  theme_void() +
  theme(legend.position = "none")
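As a rough illustration of why matching on FID can go wrong: without a region argument, fortify() takes the id column from the polygon IDs of the spatial object, which as far as I know correspond to the row names of tex@data (typically starting at 0) rather than to any attribute column. The sketch below uses a made-up toy data frame in place of tex@data; the 1-based FID values and the 0-based row names are assumptions for illustration, not values read from the actual shapefile.

```r
# Toy stand-in for tex@data; all values are made up for illustration.
attrs <- data.frame(
  NAME2 = c("Aldine", "Bryan", "Donna", "Houston"),
  FID   = c(1, 2, 3, 4),                 # assumed: attribute column counting from 1
  stringsAsFactors = FALSE
)
rownames(attrs) <- c("0", "1", "2", "3") # assumed: polygon IDs counting from 0

districts <- c("Bryan", "Houston")

# IDs obtained by matching on the FID attribute column
ids_by_fid <- as.character(attrs$FID[attrs$NAME2 %in% districts])

# IDs obtained from the row names (what fortify() is assumed to use as id)
ids_by_rowname <- rownames(attrs)[attrs$NAME2 %in% districts]

# Districts that would end up highlighted under each approach
highlighted_by_fid     <- attrs$NAME2[rownames(attrs) %in% ids_by_fid]
highlighted_by_rowname <- attrs$NAME2[rownames(attrs) %in% ids_by_rowname]

print(highlighted_by_fid)      # off by one: the wrong district is picked
print(highlighted_by_rowname)  # the districts that were asked for
```

If this is what is happening with the real data, comparing head(rownames(tex@data)) with head(tex@data$FID) should show the same kind of offset.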
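Alternative 2
This one does not pass the region argument to the fortify function, which addresses the problem seeellayewhy ran into when implementing the previous alternative. We add two layers, with no need to create a dummy variable or merge any data frames.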
tex <- readOGR(dsn = paste0(getwd(), "/Current_Districts/Current_Districts.shp"))
# colors to use and districts to highlight
cols<- c("#CCCCCC", "#003082")
districts <- c("Aldine", "Laredo", "Spring Branch", "United", "Donna", "Brownsville", "Houston", "Bryan", "Galena Park", "San Felipe-Del Rio Cons")
# Subset the shape file into two
tex1 <- subset(tex, NAME2 %in% districts)
tex2 <- subset(tex, !(NAME2 %in% districts))
# Create two data frames
tex_df1 <- fortify(tex1)
tex_df2 <- fortify(tex2)
# Plot two geom_polygon layers, one for each data frame.
# The fill is a constant per layer, so it is set outside aes()
# and no manual fill scale is needed.
ggplot() +
  geom_polygon(data = tex_df1,   # highlighted districts
               aes(x = long, y = lat, group = group),
               fill = "#003082", color = "#CCCCCC") +
  geom_polygon(data = tex_df2,   # all other districts
               aes(x = long, y = lat, group = group),
               fill = "#CCCCCC") +
  theme_void() +
  theme(legend.position = "none")
Answer 1 (score: 0)
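When trying to implement @mpalanco's solution of adding the region argument to the fortify() function, I got an error that I could not resolve with the help of numerous other Stack posts (Error: isTRUE(gpclibPermitStatus()) is not TRUE). I also tried broom::tidy(), which is the non-deprecated equivalent of fortify(), and got the same error.
Ultimately, I ended up implementing @luchanocho's solution from here. I don't like the fact that it uses seq() to generate the IDs, since that doesn't necessarily preserve the right order, but my case was simple enough that I could go through each district and confirm that the correct ones were highlighted.
My code is below. The output is the same as in @mpalanco's answer. Since he clearly got the right result and his solution is much less fragile than mine, I'm giving him the accepted answer, assuming it works. If others run into the same error, the solution below can be considered a workaround.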
#install.packages(c("ggplot2", "rgdal"))
library(ggplot2)
library(rgdal)
#rm(list=ls())
#setwd("path")
# read shapefile
tex <- readOGR(dsn = paste0(getwd(), "/Current_Districts/Current_Districts.shp"))
# colors to use and districts to highlight
cols<- c("#CCCCCC", "#003082")
districts <- c("Aldine", "Laredo", "Spring Branch", "United", "Donna", "Brownsville", "Houston", "Bryan", "Galena Park", "San Felipe-Del Rio Cons")
# convert shapefile to a df
tex_df <- fortify(tex)
# generate temp df with IDs to merge back in
names_df <- data.frame(tex@data$NAME2)
names(names_df) <- "NAME2"
names_df$id <- seq(0, nrow(names_df)-1) # this is the part I felt was sketchy
final <- merge(tex_df, names_df, by="id")
# dummy out districts of interest
final$yes <- as.factor(ifelse(final$NAME2 %in% districts, 1, 0))
ggplot(data=final) +
  geom_polygon(aes(x=long, y=lat, group=group, fill=yes), color="#CCCCCC") +
  scale_fill_manual(values=cols) +
  theme_void() +
  theme(legend.position = "none")
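One possible way to avoid the seq() step is to take the IDs from the row names of the attribute table instead of generating them. This is only a sketch on toy data: it assumes that the row names of tex@data are the same polygon IDs that fortify() writes into its id column, which I believe is how readOGR builds the object but which is worth checking with something like sapply(tex@polygons, slot, "ID"). The toy attrs and tex_df objects below are hypothetical stand-ins, and the row names are deliberately out of order to show that no ordering assumption is needed.

```r
# Toy stand-in for tex@data, with row names deliberately NOT in 0..n-1 order.
attrs <- data.frame(NAME2 = c("Aldine", "Bryan", "Donna"),
                    stringsAsFactors = FALSE)
rownames(attrs) <- c("2", "0", "1")

# Toy stand-in for fortify(tex): one row per vertex, id = polygon ID.
tex_df <- data.frame(
  long  = c(0, 1, 1, 5, 6, 6, 9, 8, 8),
  lat   = c(0, 0, 1, 5, 5, 6, 9, 9, 8),
  order = 1:9,
  id    = rep(c("2", "0", "1"), each = 3),
  stringsAsFactors = FALSE
)

# Build the lookup from the row names rather than from seq().
names_df <- data.frame(id = rownames(attrs),
                       NAME2 = attrs$NAME2,
                       stringsAsFactors = FALSE)

final <- merge(tex_df, names_df, by = "id")

# merge() may reorder rows; restore the vertex order so polygons draw correctly.
final <- final[order(final$order), ]

print(unique(final[, c("id", "NAME2")]))
```

With the real data, the same two steps (rownames(tex@data) for the id, then re-sorting by order after the merge) would replace the seq() line, under the assumption stated above.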