SpatialPoly data: SpatialData
Yield data: Yield Data
Code:
## Loading packages
library(rgdal)
library(plyr)
library(maps)
library(maptools)
library(mapdata)
library(ggplot2)
library(RColorBrewer)
library(foreign)
library(sp)
## Loading shapefiles and .csv files
#Morocco <- readOGR(dsn=".", layer="Morocco_adm0")
MoroccoReg <- readOGR(dsn=".", layer="Morocco_adm1")
MoroccoYield <- read.csv(file = "F:/Purdue University/RA_Position/PhD_ResearchandDissert/PhD_Draft/Country-CGE/RMaps_Morocco/Morocco_Yield.csv", header=TRUE, sep=",", na.string="NA", dec=".", strip.white=TRUE)
## Reorder the data in the shapefile based on the category variable "ID_1" and change to dataframe
MoroccoReg <- MoroccoReg[order(MoroccoReg$ID_1), ]
MoroccoReg.df <- fortify(MoroccoReg)
## Add the yield impacts column to shapefile from the .csv file by "ID_1"
## Note that in the .csv file, I just added the column "ID_1" to match it with the shapefile
MoroccoReg.df <- cbind(MoroccoReg.df,MoroccoYield,by = 'ID_1')
## Check the structure and contents of shapefile
attributes(MoroccoReg.df)
## Define new theme for map
## I have found this function on the website
theme_map <- function (base_size = 12, base_family = "") {
theme_gray(base_size = base_size, base_family = base_family) %+replace%
theme(
axis.line=element_blank(),
axis.text.x=element_blank(),
axis.text.y=element_blank(),
axis.ticks=element_blank(),
axis.ticks.length=unit(0.3, "lines"),
axis.ticks.margin=unit(0.5, "lines"),
axis.title.x=element_blank(),
axis.title.y=element_blank(),
legend.background=element_rect(fill="white", colour=NA),
legend.key=element_rect(colour="white"),
legend.key.size=unit(1.5, "lines"),
legend.position="right",
legend.text=element_text(size=rel(1.2)),
legend.title=element_text(size=rel(1.4), face="bold", hjust=0),
panel.background=element_blank(),
panel.border=element_blank(),
panel.grid.major=element_blank(),
panel.grid.minor=element_blank(),
panel.margin=unit(0, "lines"),
plot.background=element_blank(),
plot.margin=unit(c(1, 1, 0.5, 0.5), "lines"),
plot.title=element_text(size=rel(1.8), face="bold", hjust=0.5),
strip.background=element_rect(fill="grey90", colour="grey50"),
strip.text.x=element_text(size=rel(0.8)),
strip.text.y=element_text(size=rel(0.8), angle=-90)
)
}
## Plotting
MoroccoRegMap1 <- ggplot(data = MoroccoReg.df, aes(long, lat, group = group))
MoroccoRegMap1 <- MoroccoRegMap1 + geom_polygon(aes(fill = A2Med_noCO2))
MoroccoRegMap1 <- MoroccoRegMap1 + geom_path(colour = 'gray', linetype = 2)
#MoroccoRegMap <- MoroccoRegMap + scale_fill_gradient(low = "#CC0000",high = "#006600")
MoroccoRegMap1 <- MoroccoRegMap1 + scale_fill_gradient2(name = "%Change in yield",low = "#CC0000",mid = "#FFFFFF",high = "#006600")
MoroccoRegMap1 <- MoroccoRegMap1 + labs(title="SRES_A2, noCO2 Effect")
MoroccoRegMap1 <- MoroccoRegMap1 + coord_equal() + theme_map()
MoroccoRegMap1
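Result:
Question:
In the yield data, I have a column of labels describing each entry in the "ID_1" column. I am trying to accomplish two things:
1) Plot the map and add the "ID_1" entries to it as labels, so that each region can be identified;
2) In addition to the legend for the data values, produce a second legend that lists the entries in "ID_1" together with their descriptions from the "Label" column of the data frame.
I hope I have stated the question clearly.
Thanks.
Answer 0 (score: 9)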
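First, let me apologize for taking so long to get back to you; I missed your comment among all the others. Is this what you had in mind?
This was produced with the code below. Before getting to the explanation, you should be aware that creating a legend is the least of your problems. Notice how the colors differ between the two maps. The code above does not assign the yield changes (the A2Med_noCO2 column) to the correct regions. For example, according to Morocco_Yield.csv, the largest change (improvement?) is -0.205, in Region 4, but on the map the largest change (the deepest red) is at the northeastern tip of Morocco, which is actually l'Oriental (Region 6). An explanation follows the code.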
## Loading packages
library(rgdal)
library(plyr)
library(maps)
library(maptools)
library(mapdata)
library(ggplot2)
library(RColorBrewer)
library(foreign)
library(sp)
# get.centroids: function to extract polygon ID and centroid from shapefile
get.centroids = function(x){
poly = MoroccoReg@polygons[[x]]
ID = poly@ID
centroid = as.numeric(poly@labpt)
return(c(id=ID, long=centroid[1], lat=centroid[2]))
}
#setwd("Directory where shapefile and Yields are stored")
## Loading shapefiles and .csv files
MoroccoReg <- readOGR(dsn=".", layer="Morocco_adm1")
MoroccoYield <- read.csv(file = "Morocco_Yield.csv", header=TRUE, sep=",", na.string="NA", dec=".", strip.white=TRUE)
MoroccoYield$ID_1 <- substr(MoroccoYield$ID_1,3,10)
## Reorder the data in the shapefile based on the category variable "ID_1" and change to dataframe
MoroccoReg <- MoroccoReg[order(MoroccoReg$ID_1), ]
MoroccoYield <- cbind(id=rownames(MoroccoReg@data),MoroccoYield)
# build table of labels for annotation (legend).
labs <- do.call(rbind,lapply(1:14,get.centroids))
labs <- merge(labs,MoroccoYield[,c("id","ID_1","Label")],by="id")
labs[,2:3] <- sapply(labs[,2:3],function(x){as.numeric(as.character(x))})
labs$sort <- as.numeric(as.character(labs$ID_1))
labs <- labs[order(labs$sort),]
MoroccoReg.df <- fortify(MoroccoReg)
## This does NOT work...
## Add the yield impacts column to shapefile from the .csv file by "ID_1"
## Note that in the .csv file, I just added the column "ID_1" to match it with the shapefile
#MoroccoReg.df <- cbind(MoroccoReg.df,MoroccoYield,by = 'ID_1')
## Do it this way...
MoroccoReg.df <- merge(MoroccoReg.df,MoroccoYield, by="id")
## Check the structure and contents of shapefile
attributes(MoroccoReg.df)
## Plotting
MoroccoRegMap1 <- ggplot(data = MoroccoReg.df, aes(long, lat, group=id))
MoroccoRegMap1 <- MoroccoRegMap1 + geom_polygon(aes(fill = A2Med_noCO2))
MoroccoRegMap1 <- MoroccoRegMap1 + geom_path(colour = 'gray', linetype = 2)
MoroccoRegMap1 <- MoroccoRegMap1 + scale_fill_gradient2(name = "%Change in yield",low = "#CC0000",mid = "#FFFFFF",high = "#006600")
MoroccoRegMap1 <- MoroccoRegMap1 + labs(title="SRES_A2, noCO2 Effect")
MoroccoRegMap1 <- MoroccoRegMap1 + coord_equal() #+ theme_map()
MoroccoRegMap1 <- MoroccoRegMap1 + geom_text(data=labs, aes(x=long, y=lat, label=ID_1), size=4)
MoroccoRegMap1 <- MoroccoRegMap1 + annotate("text", x=max(labs$long)-5, y=min(labs$lat)+3-0.5*(1:14),
label=paste(labs$ID_1,": ",labs$Label,sep=""),
size=3, hjust=0)
MoroccoRegMap1
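Explanation:
First, on merging the yield data with the map regions. You used:
MoroccoReg.df <- cbind(MoroccoReg.df,MoroccoYield,by = 'ID_1')
This is not how cbind(...) works. cbind(...) simply combines its arguments column-wise; it is not a merge function. You have a data frame, MoroccoReg.df, with 107,800 rows (one row for every line endpoint on the map), and you are combining it with MoroccoYield, which has 14 rows (one per region). So cbind(...) replicates those 14 rows 7,700 times to fill the required 107,800 rows. The expression by="ID_1" just adds one more column, named "by", containing "ID_1" replicated 107,800 times. Run the statement above, type head(MoroccoReg.df), and look at the last column.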
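The recycling behaviour described above can be reproduced on a small scale. This is a minimal sketch using made-up toy data frames (map_df and yield_df are hypothetical stand-ins, not the Morocco files), assuming only base R:

```r
# Toy stand-ins (hypothetical data, not the Morocco shapefile or csv).
map_df   <- data.frame(id = rep(c("0", "1", "2"), each = 2), long = 1:6)    # 6 rows
yield_df <- data.frame(ID_1 = c("1", "2", "3"), yield = c(-0.1, 0.2, 0.3))  # 3 rows

# cbind() recycles the 3 yield rows twice to fill 6 rows,
# and treats by = "ID_1" as one more column of data, not as a join key.
wrong <- cbind(map_df, yield_df, by = "ID_1")

print(wrong)
```

In the printed result the yield values simply cycle down the rows, regardless of the id column, which is the same mismatch that puts the wrong colors on the map.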
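So how do you do the merge? There are a number of functions in R that are supposed to make this easy, but I could not get them to work. This is what did work:
Each polygon in the shapefile has an ID. There is also an "ID_1" field in the data section of the shapefile, but the two are different and unrelated. [As an aside: the ID_1 field in the shapefile's data section and the ID_1 field in the csv file are also different. The latter has "TR" prepended to the region number, so that has to be dealt with as well.]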
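As a small illustration of the "TR" prefix handling, here is a sketch on made-up values. The exact contents of the csv's ID_1 column are an assumption (the vector csv_ids below is hypothetical); the substr(...) call is the one used in the code above, and the sub(...) variant is an alternative that is not part of the original answer:

```r
# Hypothetical examples of what the csv's ID_1 column might contain.
csv_ids <- c("TR1", "TR2", "TR10", "TR14")

# As in the answer's code: keep characters 3 through 10, dropping the 2-character prefix.
stripped <- substr(csv_ids, 3, 10)

# Alternative (an assumption, not from the answer): remove a leading "TR" explicitly.
stripped_alt <- sub("^TR", "", csv_ids)

print(stripped)
```

The substr(...) version depends on the prefix always being exactly two characters long; the sub(...) version only removes a literal leading "TR".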
Reordering the shapefile with:
MoroccoReg <- MoroccoReg[order(MoroccoReg$ID_1), ]
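changes the order of the polygons but does not change their IDs. It turns out that the polygon IDs match the row names in the data section of the shapefile, so I add these (using cbind(...)!) to the MoroccoYield data frame.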
MoroccoYield <- cbind(id=rownames(MoroccoReg@data),MoroccoYield)
So now MoroccoYield has an id field, which maps to the polygon IDs, and an ID_1 field, which identifies the region. Now we can fortify(...) and merge(...). Unlike cbind(...), merge(...) does take a by= argument.
MoroccoReg.df <- fortify(MoroccoReg)
MoroccoReg.df <- merge(MoroccoReg.df,MoroccoYield, by="id")
This appends all of your MoroccoYield columns to the corresponding rows of MoroccoReg.df.
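The whole sequence (reorder, attach the row names as id, then merge) can be sketched on toy data. Everything below is a hypothetical stand-in: region_data plays the role of MoroccoReg@data, map_df the role of the fortified data frame, and the yield table is assumed to be listed in ID_1 order, as in the csv:

```r
# Stand-in for MoroccoReg@data: row names play the role of polygon IDs.
region_data <- data.frame(ID_1 = c(3, 1, 2), row.names = c("0", "1", "2"))

# Reordering rows changes their order but keeps each row's name (its polygon ID).
region_data <- region_data[order(region_data$ID_1), , drop = FALSE]
print(rownames(region_data))

# Stand-in for the yield table, assumed to be in ID_1 order.
yield_df <- data.frame(ID_1 = c("1", "2", "3"), yield = c(-0.1, 0.2, 0.3))
yield_df <- cbind(id = rownames(region_data), yield_df)

# Stand-in for the fortified map: two vertices per polygon.
map_df <- data.frame(id = rep(c("0", "1", "2"), each = 2), long = 1:6)

# merge() matches rows on the shared id key instead of recycling.
merged <- merge(map_df, yield_df, by = "id")
print(merged)
```

Here polygon "0" belongs to region 3, so both of its vertices receive region 3's yield value. That key-based matching is what cbind(...) cannot provide.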
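Creating the legend:
The obvious question is where to position the labels. Ideally, we would put the region number from MoroccoYield$ID_1 at the centroid of each region, and then create a legend that identifies the regions based on MoroccoYield$Label.
So where do we find the centroids? They are stored in an obscure place in the polygon section of the shapefile. In short, I created a utility function, get.centroids(...), which extracts the polygon ID and centroid from a polygon. I then apply that function to all of the polygons to produce a table of centroids with the corresponding polygon IDs, and merge that table with the labels from MoroccoYield. This creates a data frame, labs, with the following columns: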
id: polygon ID
long: centroid longitude
lat: centroid latitude
ID_1: region ID
Label: region name
sort: a sortable (numeric) version of ID_1
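One detail of building this table is easy to miss: because get.centroids(...) returns the ID and both coordinates in a single c(...) vector, everything is coerced to character, which is why the coordinates are converted back to numbers afterwards. A minimal sketch with a hypothetical stand-in function (toy_centroid and its coordinates are invented for illustration):

```r
# Hypothetical stand-in for get.centroids(): mixing a character id with numbers
# in c() coerces the whole vector to character.
toy_centroid <- function(i) {
  c(id = as.character(i - 1), long = -7.5 + i, lat = 30 + 0.5 * i)
}

# Stack the per-polygon vectors into one table; the result is a character matrix.
m <- do.call(rbind, lapply(1:3, toy_centroid))
print(is.character(m))

# Convert to a data frame and restore the coordinates to numeric.
labs_df <- as.data.frame(m, stringsAsFactors = FALSE)
labs_df[, c("long", "lat")] <- sapply(labs_df[, c("long", "lat")],
                                      function(x) as.numeric(as.character(x)))
str(labs_df)
```

The as.numeric(as.character(x)) idiom also gives the right values if the columns end up as factors, which can happen in older versions of R.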
Then, add the following code to your ggplot...
...
MoroccoRegMap1 <- MoroccoRegMap1 + geom_text(data=labs, aes(x=long, y=lat, label=ID_1), size=4)
MoroccoRegMap1 <- MoroccoRegMap1 + annotate("text", x=max(labs$long)-5, y=min(labs$lat)+3-0.5*(1:14),
label=paste(labs$ID_1,": ",labs$Label,sep=""),
size=3, hjust=0)
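...to create the map. I could not find a clean way to do this with a formal "ggplot legend", so I had to resort to annotate(...). Positioning the annotation is a bit of a hack, but it seems to work.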
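The positioning arithmetic passed to annotate(...) can be examined on its own. In this sketch the label table lab_pts is hypothetical, and seq_len(n) replaces the hard-coded 1:14 as an assumption to make the row count follow the data. The offsets (-5, +3, 0.5) are the ones used above and were chosen by eye for this particular map:

```r
# Hypothetical label table with three regions.
lab_pts <- data.frame(
  long  = c(-9, -5, -2),
  lat   = c(28, 32, 35),
  ID_1  = c("1", "2", "3"),
  Label = c("Region A", "Region B", "Region C")
)

n <- nrow(lab_pts)

# One x position for the whole text block, 5 degrees left of the easternmost centroid.
legend_x <- max(lab_pts$long) - 5

# One y position per legend row, stepping down 0.5 degrees per entry.
legend_y <- min(lab_pts$lat) + 3 - 0.5 * seq_len(n)

# Legend text in the form "ID: description".
legend_text <- paste(lab_pts$ID_1, ": ", lab_pts$Label, sep = "")

print(data.frame(x = legend_x, y = legend_y, label = legend_text))
```

Because the positions are in map coordinates, the offsets need adjusting for any map with a different extent.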
Edit: In response to @smailov83's comment, if you change the code that creates the ggplot to this...
MoroccoRegMap1 <- ggplot(data = MoroccoReg.df, aes(long, lat, group=group))
MoroccoRegMap1 <- MoroccoRegMap1 + geom_polygon(aes(fill = A2Med_noCO2))
MoroccoRegMap1 <- MoroccoRegMap1 + geom_path(colour = 'gray', linetype = 2)
MoroccoRegMap1 <- MoroccoRegMap1 + scale_fill_gradient2(name = "%Change in yield",low = "#CC0000",mid = "#FFFFFF",high = "#006600")
MoroccoRegMap1 <- MoroccoRegMap1 + labs(title="SRES_A2, noCO2 Effect")
MoroccoRegMap1 <- MoroccoRegMap1 + coord_equal() #+ theme_map()
MoroccoRegMap1 <- MoroccoRegMap1 + geom_text(data=labs, aes(x=long, y=lat, label=ID_1, group=ID_1), size=4)
MoroccoRegMap1 <- MoroccoRegMap1 + annotate("text", x=max(labs$long)-5, y=min(labs$lat)+3-0.5*(1:14),
label=paste(labs$ID_1,": ",labs$Label,sep=""),
size=3, hjust=0)
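...you get this:
I think that solves the problem. The reason for the extra lines in the map is that ggplot must group by the column MoroccoReg.df$group (so aes(..., group=group), not aes(..., group=id)). But when you do this, ggplot tries to group by "group" in every layer. In geom_text(...) we are using a new, local dataset (the labs data frame), which has no group column. To get around this, we have to explicitly set group to something else in geom_text(...). Bottom line: this seems to work.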
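For comparison, here is a sketch of a different way to stop the text layer from picking up the plot-level grouping. It is not the approach used above: instead of overriding group inside geom_text(...), it passes inherit.aes = FALSE so the layer ignores the plot-level aesthetics altogether. The data frames polys and lab_pts are hypothetical toy data, and the sketch assumes ggplot2 is installed:

```r
library(ggplot2)

# Toy "fortified" polygons: two unit squares, each with its own group value.
polys <- data.frame(
  long  = c(0, 1, 1, 0,  2, 3, 3, 2),
  lat   = c(0, 0, 1, 1,  0, 0, 1, 1),
  group = rep(c("a.1", "b.1"), each = 4)
)

# Toy label table: note that it has no group column.
lab_pts <- data.frame(long = c(0.5, 2.5), lat = c(0.5, 0.5), ID_1 = c("1", "2"))

p <- ggplot(polys, aes(long, lat, group = group)) +
  geom_polygon(fill = "grey80", colour = "grey40") +
  # inherit.aes = FALSE: this layer uses only the aesthetics given here,
  # so it never looks for a group column in lab_pts.
  geom_text(data = lab_pts, aes(x = long, y = lat, label = ID_1),
            inherit.aes = FALSE) +
  coord_equal()

# Inspect the computed data for the text layer (layer 2).
text_layer <- layer_data(p, 2)
print(text_layer[, c("x", "y", "label")])
```

Either approach keeps group=group on the polygon layers, which is what removes the stray lines.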