How can I use ggplot2 to add a legend describing the region labels on a regions map?

Asked: 2013-12-04 14:49:10

Tags: r map ggplot2

SpatialPoly data: SpatialData

Yield data: Yield Data

Code:

    ## Loading packages
    library(rgdal)
    library(plyr)
    library(maps)
    library(maptools)
    library(mapdata)
    library(ggplot2)
    library(RColorBrewer)
    library(foreign)  
    library(sp)

    ## Loading shapefiles and .csv files
    #Morocco <- readOGR(dsn=".", layer="Morocco_adm0")
    MoroccoReg <- readOGR(dsn=".", layer="Morocco_adm1")
    MoroccoYield <- read.csv(file = "F:/Purdue University/RA_Position/PhD_ResearchandDissert/PhD_Draft/Country-CGE/RMaps_Morocco/Morocco_Yield.csv", header=TRUE, sep=",", na.string="NA", dec=".", strip.white=TRUE)

    ## Reorder the data in the shapefile based on the category variable "ID_1" and change to dataframe
    MoroccoReg <- MoroccoReg[order(MoroccoReg$ID_1), ]
    MoroccoReg.df <- fortify(MoroccoReg)

    ## Add the yield impacts column to shapefile from the .csv file by "ID_1"
    ## Note that in the .csv file, I just added the column "ID_1" to match it with the shapefile
    MoroccoReg.df <- cbind(MoroccoReg.df,MoroccoYield,by = 'ID_1')

    ## Check the structure and contents of shapefile
    attributes(MoroccoReg.df)

    ## Define new theme for map
    ## I have found this function on the website
    theme_map <- function (base_size = 12, base_family = "") {
    theme_gray(base_size = base_size, base_family = base_family) %+replace% 
    theme(
    axis.line=element_blank(),
    axis.text.x=element_blank(),
    axis.text.y=element_blank(),
    axis.ticks=element_blank(),
    axis.ticks.length=unit(0.3, "lines"),
    axis.ticks.margin=unit(0.5, "lines"),
    axis.title.x=element_blank(),
    axis.title.y=element_blank(),
    legend.background=element_rect(fill="white", colour=NA),
    legend.key=element_rect(colour="white"),
    legend.key.size=unit(1.5, "lines"),
    legend.position="right",
    legend.text=element_text(size=rel(1.2)),
    legend.title=element_text(size=rel(1.4), face="bold", hjust=0),
    panel.background=element_blank(),
    panel.border=element_blank(),
    panel.grid.major=element_blank(),
    panel.grid.minor=element_blank(),
    panel.margin=unit(0, "lines"),
    plot.background=element_blank(),
    plot.margin=unit(c(1, 1, 0.5, 0.5), "lines"),
    plot.title=element_text(size=rel(1.8), face="bold", hjust=0.5),
    strip.background=element_rect(fill="grey90", colour="grey50"),
    strip.text.x=element_text(size=rel(0.8)),
    strip.text.y=element_text(size=rel(0.8), angle=-90) 
    )   
    }

    ## Plotting 

    MoroccoRegMap1 <- ggplot(data = MoroccoReg.df, aes(long, lat, group = group)) 
    MoroccoRegMap1 <- MoroccoRegMap1 + geom_polygon(aes(fill = A2Med_noCO2))
    MoroccoRegMap1 <- MoroccoRegMap1 + geom_path(colour = 'gray', linestyle = 2)
    #MoroccoRegMap <- MoroccoRegMap + scale_fill_gradient(low = "#CC0000",high = "#006600")
    MoroccoRegMap1 <- MoroccoRegMap1 + scale_fill_gradient2(name = "%Change in yield",low = "#CC0000",mid = "#FFFFFF",high = "#006600")
    MoroccoRegMap1 <- MoroccoRegMap1 + labs(title="SRES_A2, noCO2 Effect")
    MoroccoRegMap1 <- MoroccoRegMap1 + coord_equal() + theme_map()
    MoroccoRegMap1

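Result:

Map

Question:

In the yield data, I have a column of labels describing each entry in the "ID_1" column. I would like to achieve two things:

1) Plot the map with the "ID_1" entries added as labels on the map, so that each region can be identified;

2) Generate a second legend, in addition to the one for the data values, that lists the "ID_1" entries together with their descriptions from the "Label" column of the data frame.

I hope I have stated the question clearly.

Thanks.

1 answer:

Answer 0 (score: 9)

First, let me apologize for taking so long to get back to you - I missed your comment among all the others. Is this what you had in mind?

This was generated using the code below. Before getting into the explanation, you should realize that creating a legend is the least of your problems. Notice how the colors in the two maps differ. Your code above does not assign the yield changes (column A2Med_noCO2) to the correct regions. For example, according to Morocco_Yield.csv, the largest change (improvement?), -0.205, is in Region 4, but in your map the largest (darkest red) is at the northeastern tip of Morocco, which is actually l'Oriental (Region 6). An explanation follows the code.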

## Loading packages
library(rgdal)
library(plyr)
library(maps)
library(maptools)
library(mapdata)
library(ggplot2)
library(RColorBrewer)
library(foreign)  
library(sp)

# get.centroids: function to extract polygon ID and centroid from shapefile
get.centroids = function(x){
  poly = MoroccoReg@polygons[[x]]
  ID   = poly@ID
  centroid = as.numeric(poly@labpt)
  return(c(id=ID, long=centroid[1], lat=centroid[2]))
}
#setwd("Directory where shapefile and Yields are stored")
## Loading shapefiles and .csv files
MoroccoReg        <- readOGR(dsn=".", layer="Morocco_adm1")
MoroccoYield      <- read.csv(file = "Morocco_Yield.csv", header=TRUE, sep=",", na.string="NA", dec=".", strip.white=TRUE)
MoroccoYield$ID_1 <- substr(MoroccoYield$ID_1,3,10)

## Reorder the data in the shapefile based on the category variable "ID_1" and change to dataframe
MoroccoReg    <- MoroccoReg[order(MoroccoReg$ID_1), ]
MoroccoYield  <- cbind(id=rownames(MoroccoReg@data),MoroccoYield)
#  build table of labels for annotation (legend).
labs          <- do.call(rbind,lapply(1:14,get.centroids))
labs          <- merge(labs,MoroccoYield[,c("id","ID_1","Label")],by="id")
labs[,2:3]    <- sapply(labs[,2:3],function(x){as.numeric(as.character(x))})
labs$sort <- as.numeric(as.character(labs$ID_1))
labs          <- labs[order(labs$sort),]

MoroccoReg.df <- fortify(MoroccoReg)
## This does NOT work...
## Add the yield impacts column to shapefile from the .csv file by "ID_1"
## Note that in the .csv file, I just added the column "ID_1" to match it with the shapefile
#MoroccoReg.df <- cbind(MoroccoReg.df,MoroccoYield,by = 'ID_1')
## Do it this way...
MoroccoReg.df <- merge(MoroccoReg.df,MoroccoYield, by="id")

## Check the structure and contents of shapefile
attributes(MoroccoReg.df)
## Plotting 

MoroccoRegMap1 <- ggplot(data = MoroccoReg.df, aes(long, lat, group=id)) 
MoroccoRegMap1 <- MoroccoRegMap1 + geom_polygon(aes(fill = A2Med_noCO2))
MoroccoRegMap1 <- MoroccoRegMap1 + geom_path(colour = 'gray', linetype = 2)
MoroccoRegMap1 <- MoroccoRegMap1 + scale_fill_gradient2(name = "%Change in yield",low = "#CC0000",mid = "#FFFFFF",high = "#006600")
MoroccoRegMap1 <- MoroccoRegMap1 + labs(title="SRES_A2, noCO2 Effect")
MoroccoRegMap1 <- MoroccoRegMap1 + coord_equal() #+ theme_map()
MoroccoRegMap1 <- MoroccoRegMap1 + geom_text(data=labs, aes(x=long, y=lat, label=ID_1), size=4)
MoroccoRegMap1 <- MoroccoRegMap1 + annotate("text", x=max(labs$long)-5, y=min(labs$lat)+3-0.5*(1:14),
                                            label=paste(labs$ID_1,": ",labs$Label,sep=""),
                                            size=3, hjust=0)
MoroccoRegMap1

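Explanation

First, on merging the yield data with the map regions: using

MoroccoReg.df <- cbind(MoroccoReg.df,MoroccoYield,by = 'ID_1')

This is not how cbind(...) works. cbind(...) simply combines its arguments column-wise. It is not a merge function. So you have a data frame, MoroccoReg.df, with 107,800 rows (one row for every line endpoint on the map), and you are combining it with MoroccoYield, which has 14 rows (one per region). cbind(...) therefore replicates those 14 rows 7,700 times to fill out the required 107,800 rows. The expression by = 'ID_1' merely adds another column, named "by", with "ID_1" replicated 107,800 times. Run the statement above, type head(MoroccoReg.df), and look at the last column.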
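To see the recycling for yourself without the shapefile, here is a minimal sketch on toy data. The data frames big and small and their values are made up for illustration; only cbind(...) and the misuse of by = "ID_1" come from the code above.

```r
# Toy stand-ins (hypothetical data, not the Morocco files)
big   <- data.frame(id = rep(1:3, each = 2), x = 1:6)             # 6 rows, like the fortified map
small <- data.frame(ID_1 = c(3, 1, 2), yield = c(0.3, 0.1, 0.2))  # 3 rows, like the yield table

wrong <- cbind(big, small, by = "ID_1")

# small is simply repeated to fill 6 rows; nothing is matched on ID_1
print(wrong)

# the "by" argument just became a constant column
print(unique(wrong$by))
```

Row 1 of wrong pairs id 1 with ID_1 3, which is exactly the kind of mismatch that puts the colors on the wrong regions.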

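So how do you do the merge? There are a number of functions in R that are supposed to make this easy, but I could not get them to work. This is what did work:

Each polygon in the shapefile has an ID. There is also an "ID_1" field in the data section of the shapefile, but the two are different and unrelated. [BTW: the ID_1 field in the shapefile data section and the ID_1 field in the csv file are also different: the latter has "TR" prepended to the region number, so that has to be dealt with as well]. Reordering the shapefile using: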

MoroccoReg    <- MoroccoReg[order(MoroccoReg$ID_1), ]

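changes the order of the polygons but does not change their IDs. It turns out that the polygon IDs match the row names in the data section of the shapefile, so I add these (using cbind(...)!) to the MoroccoYield data frame.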

MoroccoYield  <- cbind(id=rownames(MoroccoReg@data),MoroccoYield)

So now MoroccoYield has an id field, which maps to the polygon IDs, and an ID_1 field, which identifies the Region. Now we can fortify(...) and merge(...). merge(...) does take a by= argument.

MoroccoReg.df <- fortify(MoroccoReg)
MoroccoReg.df <- merge(MoroccoReg.df,MoroccoYield, by="id")

This appends all of your MoroccoYield columns to the appropriate rows of MoroccoReg.df.
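The same steps on toy data, as a sketch. The vertex table verts and the yield table yld are invented, with an order column standing in for the one fortify(...) produces. Re-sorting by order after the merge is my own precaution (merge(...) does not promise to preserve row order), not something the code above does.

```r
# Hypothetical stand-in for fortify() output: one row per polygon vertex
verts <- data.frame(
  id    = rep(c("0", "1"), each = 3),
  long  = c(0, 1, 1, 2, 3, 3),
  lat   = c(0, 0, 1, 0, 0, 1),
  order = 1:6
)

# Hypothetical yield table: one row per region, ID_1 carries a "TR" prefix
yld <- data.frame(
  id          = c("1", "0"),
  ID_1        = c("TR2", "TR1"),
  A2Med_noCO2 = c(-0.2, 0.05)
)
yld$ID_1 <- substr(yld$ID_1, 3, 10)      # strip the "TR" prefix, as in the code above

merged <- merge(verts, yld, by = "id")   # match rows on the shared key
merged <- merged[order(merged$order), ]  # restore vertex drawing order (precaution)
print(merged)
```

Every vertex of polygon "0" now carries the yield value for that polygon, regardless of the row order in yld.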

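Creating the legend:

The obvious question is how to position the labels. Ideally, we would put the region number from MoroccoYield$ID_1 at the centroid of each region, and then create a legend identifying the regions based on MoroccoYield$Label.

So where do we find the centroids? They are stored in an obscure location in the polygons section of the shapefile. To get at them, I created a utility function, get.centroids(...), which extracts the centroid from a polygon. I then apply that function to all the polygons to produce a table of centroids with the corresponding polygon IDs, and merge that with the labels in MoroccoYield. This creates a data frame, labs, with the following columns: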

id:    polygon ID
long:  centroid longitude
lat:   centroid latitude
ID_1:  region ID
Label: region name
sort:  a sortable (numeric) version of ID_1
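If you only have the fortified data frame and not the polygon slots, a rough stand-in for the label position is the mean of each polygon's vertices. This is not the labpt value that get.centroids(...) reads and not a true area centroid, just a sketch on made-up data (verts is hypothetical):

```r
# Hypothetical fortified-style vertices for two closed rectangles
verts <- data.frame(
  id   = rep(c("0", "1"), each = 5),
  long = c(0, 1, 1, 0, 0,  2, 4, 4, 2, 2),
  lat  = c(0, 0, 1, 1, 0,  0, 0, 2, 2, 0)
)

# Drop the repeated closing vertex so it does not bias the mean
open_verts <- verts[!duplicated(verts[c("id", "long", "lat")]), ]

# One row per polygon id with the mean longitude and latitude
cents <- aggregate(cbind(long, lat) ~ id, data = open_verts, FUN = mean)
print(cents)
```

For oddly shaped regions the vertex mean can fall outside the polygon, so the labpt approach above is the safer choice when the shapefile object is available.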

Then, adding the following code to your ggplot...

...
MoroccoRegMap1 <- MoroccoRegMap1 + geom_text(data=labs, aes(x=long, y=lat, label=ID_1), size=4)
MoroccoRegMap1 <- MoroccoRegMap1 + annotate("text", x=max(labs$long)-5, y=min(labs$lat)+3-0.5*(1:14),
                                            label=paste(labs$ID_1,": ",labs$Label,sep=""),
                                            size=3, hjust=0)

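...creates the map. I could not find a clean way to do this with a formal "ggplot legend", so I had to use annotate(...). Positioning the annotation is a bit of a hack, but it seems to work.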
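The positioning arithmetic can be checked on its own. Here is a sketch with a hypothetical three-region table, labs_toy (the coordinates and region names are placeholders). It uses seq_len(nrow(...)) in place of the hard-coded 1:14, which is my substitution; the offsets -5, +3 and 0.5 are the hand-tuned values from the code above.

```r
# Hypothetical label table with the same columns the annotate() call uses
labs_toy <- data.frame(
  long  = c(-9.2, -6.8, -4.5),
  lat   = c(30.1, 32.4, 34.0),
  ID_1  = c("1", "2", "3"),
  Label = c("Region A", "Region B", "Region C")
)

n <- nrow(labs_toy)
legend_x    <- max(labs_toy$long) - 5                    # one x position for the whole column of text
legend_y    <- min(labs_toy$lat) + 3 - 0.5 * seq_len(n)  # one y per row, stepping down by 0.5
legend_text <- paste(labs_toy$ID_1, ": ", labs_toy$Label, sep = "")

print(data.frame(x = legend_x, y = legend_y, text = legend_text))
```

Because the offsets are in map units (degrees here), they need retuning for a map with a different extent.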

Edit: In response to @smailov83's comment, if you change the code that creates the ggplot to this...

MoroccoRegMap1 <- ggplot(data = MoroccoReg.df, aes(long, lat, group=group)) 
MoroccoRegMap1 <- MoroccoRegMap1 + geom_polygon(aes(fill = A2Med_noCO2))
MoroccoRegMap1 <- MoroccoRegMap1 + geom_path(colour = 'gray', linetype = 2)
MoroccoRegMap1 <- MoroccoRegMap1 + scale_fill_gradient2(name = "%Change in yield",low = "#CC0000",mid = "#FFFFFF",high = "#006600")
MoroccoRegMap1 <- MoroccoRegMap1 + labs(title="SRES_A2, noCO2 Effect")
MoroccoRegMap1 <- MoroccoRegMap1 + coord_equal() #+ theme_map()
MoroccoRegMap1 <- MoroccoRegMap1 + geom_text(data=labs, aes(x=long, y=lat, label=ID_1, group=ID_1), size=4)
MoroccoRegMap1 <- MoroccoRegMap1 + annotate("text", x=max(labs$long)-5, y=min(labs$lat)+3-0.5*(1:14),
                                            label=paste(labs$ID_1,": ",labs$Label,sep=""),
                                            size=3, hjust=0)

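...you get this:

I think this solves the problem. The extra lines in the map arose because ggplot must group by the column MoroccoReg.df$group (so aes(..., group=group), not aes(..., group=id)). When you do that, however, ggplot tries to group by "group" in all layers. In geom_text(...) we are using a new, local dataset, the labs data frame, which has no group column. To get around this, we have to explicitly set group to something else in geom_text(...). Bottom line: this seems to work.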
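A small sketch of the grouping fix on made-up data, assuming ggplot2 is installed; poly_df and lab_df are hypothetical. Setting group inside geom_text(...) is the fix described above. inherit.aes = FALSE is an alternative I am suggesting, not part of the original code.

```r
library(ggplot2)

# Hypothetical polygons: two squares, each with its own "group" value
poly_df <- data.frame(
  long  = c(0, 1, 1, 0, 2, 3, 3, 2),
  lat   = c(0, 0, 1, 1, 0, 0, 1, 1),
  group = rep(c("0.1", "1.1"), each = 4)
)
# Hypothetical label table: note there is no "group" column here
lab_df <- data.frame(long = c(0.5, 2.5), lat = c(0.5, 0.5), ID_1 = c("1", "2"))

base <- ggplot(poly_df, aes(long, lat, group = group)) +
  geom_polygon(fill = "grey80", colour = "grey40")

# Fix from the answer: give the text layer its own group
p1 <- base + geom_text(data = lab_df,
                       aes(x = long, y = lat, label = ID_1, group = ID_1))

# Alternative (my assumption): stop the text layer inheriting the plot-level aes
p2 <- base + geom_text(data = lab_df,
                       aes(x = long, y = lat, label = ID_1),
                       inherit.aes = FALSE)

# Both text layers build with one row per label
text1 <- layer_data(p1, 2)
text2 <- layer_data(p2, 2)
```

Without either change, the text layer inherits group = group from the plot and would normally fail because lab_df has no such column.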