Question

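I have a data table, first_median (named first_medians in the code below), that includes a location column, and another data table (LOI in the code below) that contains location and city names.

I want to merge them so that the initial data table first_median gets the city names.

The problem is that the merge produces NAs for some rows. To be clearer: the coordinate 44.03125_-123.09375 has the name Eugene. After the merge, the first two occurrences of 44.03125_-123.09375 are mapped to Eugene, while the remaining occurrences are mapped to NA.

The next strange part is that if I convert first_median to a data frame (as.data.frame(first_median)), then back to a data table (data.table(first_median)), and then do the merge, it works!

Please take a look at the picture.

Any idea what is going on?

I also changed the code to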

first_medians_merged_before <- merge(first_medians, LOI, by="location", all.x=T)
dput(head(first_medians_merged_before, 5))

first_medians <- as.data.frame(first_medians)
first_medians <- data.table(first_medians)
first_medians_merged_after <- merge(first_medians, LOI, by="location", all.x=T)
dput(head(first_medians_merged_after, 5))

To be clearer, the output of dput is as follows:

> dput(head(first_medians_merged_before, 5))
structure(list(location = c("44.03125_-123.09375", "44.03125_-123.09375", 
"44.03125_-123.09375", "44.03125_-123.09375", "44.03125_-123.09375"
), time_period = c("1950-2005", "1950-2005", "1979-2015", "1979-2015", 
"2006-2025"), emission = c("RCP 4.5", "RCP 8.5", "RCP 4.5", "RCP 8.5", 
"RCP 4.5"), median = c(72, 72, 68, 68, 78), city = c("Eugene", 
"Eugene", NA, NA, NA)), sorted = "location", class = c("data.table", 
"data.frame"), row.names = c(NA, -5L), .internal.selfref = <pointer: 0x1028114e0>)

> dput(head(first_medians_merged_after, 5))
structure(list(location = c("44.03125_-123.09375", "44.03125_-123.09375", 
"44.03125_-123.09375", "44.03125_-123.09375", "44.03125_-123.09375"
), time_period = c("1950-2005", "1950-2005", "1979-2015", "1979-2015", 
"2006-2025"), emission = c("RCP 4.5", "RCP 8.5", "RCP 4.5", "RCP 8.5", 
"RCP 4.5"), median = c(72, 72, 68, 68, 78), city = c("Eugene", 
"Eugene", "Eugene", "Eugene", "Eugene")), sorted = "location", class = c("data.table", 
"data.frame"), row.names = c(NA, -5L), .internal.selfref = <pointer: 0x1028114e0>)
>
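
For reference, here is a minimal sketch of the same merge on toy data shaped like the dput output above. The contents of LOI are an assumption (only the Eugene row is known from the question), and the variable name merged is made up for illustration. On freshly built tables the merge fills in the city for every row, which is the behaviour I expected. The last lines print a few things that may be worth comparing on the real first_medians before and after the as.data.frame round trip; whether a leftover key or "sorted" attribute is involved is only a guess, not something confirmed here.

```r
library(data.table)

# Toy data copied from the dput() output; LOI's contents are assumed.
first_medians <- data.table(
  location    = rep("44.03125_-123.09375", 5),
  time_period = c("1950-2005", "1950-2005", "1979-2015", "1979-2015", "2006-2025"),
  emission    = c("RCP 4.5", "RCP 8.5", "RCP 4.5", "RCP 8.5", "RCP 4.5"),
  median      = c(72, 72, 68, 68, 78)
)
LOI <- data.table(location = "44.03125_-123.09375", city = "Eugene")

# Left join: keep every row of first_medians, add city where location matches.
merged <- merge(first_medians, LOI, by = "location", all.x = TRUE)
print(merged$city)

# Diagnostics to compare on the real table before and after the round trip.
print(key(first_medians))                   # key columns, if any
print(attr(first_medians, "sorted"))        # raw attribute behind the key
print(is.unsorted(first_medians$location))  # is the column really in order?
```

If the real first_medians reports a key on location while is.unsorted() returns TRUE for that column, that mismatch would be the first thing I would look into.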

Merge in R produces strange results


0 answers: