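I am trying to join a non-spatial object (Merged_Census2011) to a shapefile polygon object (LDN_wards) by "ward.name". It seems to work until I look at the newly created object and see that all the data have become NA. This is how I went about it.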
library(rgdal)  # provides readOGR()
library(dplyr)  # provides left_join() and inner_join()
#Join Merged_Census2011 data to LDN_wards shapefile
LDN_wards <- readOGR(dsn = "data", layer = "LDN_wards")
head(LDN_wards@data)
#Explore the object
plot(LDN_wards)
summary(LDN_wards)
names(Merged_Census2011)
names(LDN_wards)
names(LDN_wards) <- c("Code", "ward.name") #rename the LDN_wards name column to ward.name so it can be matched later
#Join datasets
LDN_wards@data <- left_join(LDN_wards@data, Merged_Census2011)
head(LDN_wards@data)
This is what I get:
LDN_wards@data <- left_join(LDN_wards@data, Merged_Census2011)
Joining by: "ward.name"
Warning message:
In left_join_impl(x, y, by$x, by$y) :
joining factors with different levels, coercing to character vector
> head(LDN_wards@data)
       Code  ward.name ward.code.x electorate votescast ward.code.y per.owner per.white per.noquals per.degree per.couple
1 E05000001 Aldersgate        <NA>         NA        NA        <NA>        NA        NA          NA         NA         NA
2 E05000002    Aldgate        <NA>         NA        NA        <NA>        NA        NA          NA         NA
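The warning in the output above arises because the key columns are factors with different levels; dplyr coerces them to character, which is harmless in itself. A minimal sketch of converting the key to character and naming it explicitly with `by`, using small toy data frames (`wards` and `census` are hypothetical stand-ins for `LDN_wards@data` and `Merged_Census2011`, not the real data):

```r
library(dplyr)

# Hypothetical stand-ins for LDN_wards@data and Merged_Census2011
wards <- data.frame(
  Code = c("E05000001", "E05000026", "E05000027"),
  ward.name = factor(c("Aldersgate", "Abbey", "Alibon")),
  stringsAsFactors = FALSE
)
census <- data.frame(
  ward.name = factor(c("Abbey", "Alibon")),
  per.owner = c(32.7, 45.1)
)

# Convert the key to character on both sides so no coercion is needed
wards$ward.name <- as.character(wards$ward.name)
census$ward.name <- as.character(census$ward.name)

# Name the key explicitly instead of relying on the "Joining by" guess
joined <- left_join(wards, census, by = "ward.name")
print(joined)
```

In this sketch only the row without a census match (Aldersgate) ends up with NA; a left join keeps every row of the left table whether or not it matches.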
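I have a hunch that this is because the two datasets have different numbers of rows. Could that be the problem? Is it impossible to join datasets with different numbers of rows (where data missing from one simply leaves the corresponding observations unmatched)? I compared the two datasets as follows: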
#Compare the two datasets
nrow(LDN_wards)
nrow(Merged_Census2011)
LDN_wards$ward.name %in% Merged_Census2011$ward.name
summary(LDN_wards$ward.name %in% Merged_Census2011$ward.name)
> nrow(LDN_wards)
[1] 787
> nrow(Merged_Census2011)
[1] 668
> LDN_wards$ward.name %in% Merged_Census2011$ward.name
[1] FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[21] FALSE FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[41] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE ETC...
> summary(LDN_wards$ward.name %in% Merged_Census2011$ward.name)
Mode FALSE TRUE NA's
logical 24 763 0
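Reading the `%in%` output above, 763 of the 787 ward names do have a match, and all 24 FALSE values fall within the first 25 positions, which includes every row that `head()` displays. That suggests the join may have worked for most rows and that only the top of the table is empty. A sketch of how one might check this, on hypothetical toy data (`wards` stands in for the joined `LDN_wards@data`, `census` for `Merged_Census2011`):

```r
# Hypothetical stand-ins; the values are illustrative only
wards <- data.frame(
  ward.name = c("Aldersgate", "Aldgate", "Abbey", "Alibon"),
  per.owner = c(NA, NA, 32.7, 45.1),
  stringsAsFactors = FALSE
)
census <- data.frame(
  ward.name = c("Abbey", "Alibon", "Becontree"),
  stringsAsFactors = FALSE
)

# Which rows of the joined table have a partner in the census table?
matched <- wards$ward.name %in% census$ward.name
print(table(matched))

# Positions of the unmatched rows: if they cluster at the top, head() is misleading
print(which(!matched))

# Names present in the shapefile table but absent from the census table
unmatched_names <- setdiff(wards$ward.name, census$ward.name)
print(unmatched_names)

# Inspect rows that did match rather than just the first few rows
print(head(wards[matched, ]))
```

Listing the unmatched names is usually more informative than the row counts, since it shows whether they are genuinely absent from the census table or only spelled differently.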
Could it be because FALSE = 24? If so, how do I remove those FALSE rows?
Apologies if this sounds obvious, I am fairly new to this :)
Thanks for your help!
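On dropping the unmatched wards: one option is to keep only the rows whose name appears in the census table, using the same `%in%` vector as a row index. The sketch below uses a plain data frame with hypothetical toy values. As far as I know, `SpatialPolygonsDataFrame` objects from sp accept the same `[rows, ]` indexing and drop the corresponding polygons too, so the spatial equivalent would be `LDN_wards[keep, ]`, but treat that as an assumption to verify on your own object.

```r
# Hypothetical stand-ins for LDN_wards@data and Merged_Census2011
wards <- data.frame(
  Code = c("E05000001", "E05000002", "E05000026", "E05000027"),
  ward.name = c("Aldersgate", "Aldgate", "Abbey", "Alibon"),
  stringsAsFactors = FALSE
)
census <- data.frame(
  ward.name = c("Abbey", "Alibon"),
  per.owner = c(32.7, 45.1),
  stringsAsFactors = FALSE
)

# TRUE for wards that have a census row, FALSE otherwise
keep <- wards$ward.name %in% census$ward.name

# Keep only the matched wards
wards_matched <- wards[keep, ]
print(wards_matched)
```

Subsetting the whole object, rather than only its attribute table, keeps polygons and attribute rows in step.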
Answer 0 (score: 0)
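I have just tried the (newly discovered) inner_join function, and it seems to work. If I understand correctly, inner_join only merges the rows that match, so I think it is the better choice. Indeed, I no longer get NA values. Strangely, though, I now get duplicated observations, so if anyone has a better suggestion, you are very welcome to share it. See the procedure below.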
#Join datasets
LDN_wards@data <- inner_join(LDN_wards@data, Merged_Census2011)
head(LDN_wards@data, n=10)
> #Join datasets
> LDN_wards@data <- inner_join(LDN_wards@data, Merged_Census2011)
Joining by: c("ward.name", "ward.code.x", "electorate", "votescast","ward.code.y", "per.owner", "per.white", "per.noquals", "per.degree", "per.couple", "per.higher.managerial", "per.christian", "per.no.car", "per.limill", "per.goodhealth", "per.males", "per.aged60plus")
Warning message:
In inner_join_impl(x, y, by$x, by$y) :
joining character vector and factor, coercing into character vector
> head(LDN_wards@data, n=10)
       Code ward.name ward.code.x electorate votescast ward.code.y per.owner per.white per.noquals per.degree per.couple
1 E05000007    Bridge   E05000497       8677      5654   E05000497      69.8      71.9        19.9       29.9       55.3
2 E05000026     Abbey   E05000026       8110      4712   E05000026      32.7      28.1        16.4       34.5       47.2
3 E05000026     Abbey   E05000026       8110      4712   E05000455      48.5      73.4        10.1       55.4       52.4
4 E05000026     Abbey   E05000455       7250      4808   E05000026      32.7      28.1        16.4       34.5       47.2
5 E05000026     Abbey   E05000455       7250      4808   E05000455      48.5      73.4        10.1       55.4       52.4
6 E05000027    Alibon   E05000027       6971      4127   E05000027      45.1      70.1        31.2       16.7       49.2
7 E05000028 Becontree   E05000028       7535      4538   E05000028      46.7      58.8        28.0       20.6
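The duplicated rows above look like the result of a key that is not unique: "Abbey" appears with two different ward codes (E05000026 and E05000455), so joining on the name pairs every "Abbey" on one side with every "Abbey" on the other. A sketch of the effect, and of joining on the ward code instead, using hypothetical toy tables (the column name `ward.code` is an assumption; the real census table may call it something else):

```r
library(dplyr)

# Hypothetical stand-ins: two different wards that share the name "Abbey"
wards <- data.frame(
  Code = c("E05000026", "E05000455"),
  ward.name = c("Abbey", "Abbey"),
  stringsAsFactors = FALSE
)
census <- data.frame(
  ward.code = c("E05000026", "E05000455"),
  ward.name = c("Abbey", "Abbey"),
  per.owner = c(32.7, 48.5),
  stringsAsFactors = FALSE
)

# Joining on the non-unique name yields every name-to-name combination
by_name <- inner_join(wards, census, by = "ward.name")
print(nrow(by_name))  # 4 rows for 2 wards

# Joining on the unique code yields one row per ward
by_code <- left_join(wards, select(census, -ward.name),
                     by = c("Code" = "ward.code"))
print(by_code)
```

Two further points are worth checking. The `Joining by` message above lists every census column, which suggests the `inner_join` was run on a table that already held the result of the earlier `left_join`, so reloading the shapefile before joining may be safer. Also, because `inner_join` can drop or duplicate rows, assigning its result to `LDN_wards@data` may leave the attribute table out of step with the polygons; a `left_join` on a unique key keeps one row per polygon in the original order.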