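I have a data table, first_medians, that includes a location column, and another data table (LOI in the code below) that contains location and city names.
I want to merge them so that the initial data table, first_medians, gets the city names.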
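The problem is that the merge produces NA for some of the rows. To be more specific, the coordinate 44.03125_-123.09375 has the name Eugene. After merging, the first two duplicates of 44.03125_-123.09375 are mapped to Eugene, while the remaining duplicates are mapped to NA.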
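The next weird part is that if I convert first_medians to a data frame with as.data.frame(first_medians), then back to a data table with data.table(first_medians), and then do the merge, it works fine!
Please take a look at the picture.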
I have also changed the code to:
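# Merge before the round trip: some rows end up with NA in city
first_medians_merged_before <- merge(first_medians, LOI, by = "location", all.x = TRUE)
dput(head(first_medians_merged_before, 5))
# Round trip: data table -> data frame -> data table
first_medians <- as.data.frame(first_medians)
first_medians <- data.table(first_medians)
# Same merge after the round trip: every row gets its city
first_medians_merged_after <- merge(first_medians, LOI, by = "location", all.x = TRUE)
dput(head(first_medians_merged_after, 5))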
To be more specific, the output of dput is as follows:
> dput(head(first_medians_merged_before, 5))
structure(list(location = c("44.03125_-123.09375", "44.03125_-123.09375",
"44.03125_-123.09375", "44.03125_-123.09375", "44.03125_-123.09375"
), time_period = c("1950-2005", "1950-2005", "1979-2015", "1979-2015",
"2006-2025"), emission = c("RCP 4.5", "RCP 8.5", "RCP 4.5", "RCP 8.5",
"RCP 4.5"), median = c(72, 72, 68, 68, 78), city = c("Eugene",
"Eugene", NA, NA, NA)), sorted = "location", class = c("data.table",
"data.frame"), row.names = c(NA, -5L), .internal.selfref = <pointer: 0x1028114e0>)
> dput(head(first_medians_merged_after, 5))
structure(list(location = c("44.03125_-123.09375", "44.03125_-123.09375",
"44.03125_-123.09375", "44.03125_-123.09375", "44.03125_-123.09375"
), time_period = c("1950-2005", "1950-2005", "1979-2015", "1979-2015",
"2006-2025"), emission = c("RCP 4.5", "RCP 8.5", "RCP 4.5", "RCP 8.5",
"RCP 4.5"), median = c(72, 72, 68, 68, 78), city = c("Eugene",
"Eugene", "Eugene", "Eugene", "Eugene")), sorted = "location", class = c("data.table",
"data.frame"), row.names = c(NA, -5L), .internal.selfref = <pointer: 0x1028114e0>)
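In case it helps anyone reproduce this, here is a minimal sketch that rebuilds the five rows shown in the dput output from scratch. The names first_medians_demo and LOI_demo are made up for illustration, and the one-row lookup table is an assumption, since the real LOI is not shown here. The sketch also prints the key of the rebuilt table, because a key is one piece of state that, as far as I know, a round trip through as.data.frame() would drop.

```r
library(data.table)

# Hypothetical miniature copy of first_medians, built from the dput output above
first_medians_demo <- data.table(
  location    = rep("44.03125_-123.09375", 5),
  time_period = c("1950-2005", "1950-2005", "1979-2015", "1979-2015", "2006-2025"),
  emission    = c("RCP 4.5", "RCP 8.5", "RCP 4.5", "RCP 8.5", "RCP 4.5"),
  median      = c(72, 72, 68, 68, 78)
)

# Assumed one-row stand-in for LOI; the real table is not shown in the question
LOI_demo <- data.table(location = "44.03125_-123.09375", city = "Eugene")

# A freshly built data table has no key, so this prints NULL
print(key(first_medians_demo))

# Same left merge as in the question
merged_demo <- merge(first_medians_demo, LOI_demo, by = "location", all.x = TRUE)
print(merged_demo)

# TRUE would mean the NA problem was reproduced
print(anyNA(merged_demo$city))
```

If the rebuilt table merges cleanly while the original one does not, the difference is presumably in some attribute of the original object rather than in the location values themselves, so comparing key(first_medians) and attributes(first_medians) before and after the round trip might be a reasonable next check.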