I'm trying to find a scalable way to update one data.frame based on the values in another data.frame. Here is a minimal example:
df1 <- data.frame(cbind(c("a","b","b","b","c"),c(1,1,1,2,2),as.numeric(c(0.2,0.6,0.6,0.8,0.4))))
colnames(df1) <- c("ID1", "ID2","Value")
> df1
ID1 ID2 Value
1 a 1 0.2
2 b 1 0.6
3 b 1 0.6
4 b 2 0.8
5 c 2 0.4
df2 <- data.frame(cbind(2),0,0.45,0.5)
colnames(df2) <- c("ID2", "a","b","c")
> df2
ID2 a b c
1 2 0 0.45 0.5
Now I would like to update the values in df1 using the values from df2, to get the following result:
ID1 ID2 Value
1 a 1 0.2
2 b 1 0.6
3 b 1 0.6
4 b 2 0.45
5 c 2 0.5
Can anyone help with this?
Answer 0 (score: 0)
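First, let's get rid of cbind when creating the data sets, so that the data isn't converted to a matrix first, which messes up the column classes.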
df1 <- data.frame(c("a","b","b","b","c"),c(1,1,1,2,2),c(0.2,0.6,0.6,0.8,0.4))
colnames(df1) <- c("ID1", "ID2","Value")
df2 <- data.frame(2,0,0.45,0.5)
colnames(df2) <- c("ID2", "a","b","c")
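To see why dropping cbind matters, here is a small sketch. The column names x and y are made up for illustration, and stringsAsFactors = FALSE is passed only so that the result is the same across R versions. cbind on vectors of different types builds a character matrix, so the numeric column reaches data.frame as text.

```r
# cbind() coerces mixed vectors to a single type (character) before
# data.frame() ever sees them.
with_cbind <- data.frame(cbind(x = c("a", "b"), y = c(1, 2)),
                         stringsAsFactors = FALSE)

# Passing the vectors directly lets each column keep its own class.
without_cbind <- data.frame(x = c("a", "b"), y = c(1, 2),
                            stringsAsFactors = FALSE)

sapply(with_cbind, class)     # both columns are "character"
sapply(without_cbind, class)  # x is "character", y is "numeric"
```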
Then, I would first convert your second data set to long format and then update df1 in place:
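library(data.table)  # requires v >= 1.9.6 (for the on= argument)
# Reshape df2 to long format: one row per (ID2, variable) pair, keeping column 1 (ID2) as the id
ldf2 <- melt(setDT(df2), 1)
# Join on both keys and overwrite Value by reference wherever a match exists
setDT(df1)[ldf2, Value := i.value, on = c(ID1 = "variable", ID2 = "ID2")]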
df1
# ID1 ID2 Value
# 1: a 1 0.20
# 2: b 1 0.60
# 3: b 1 0.60
# 4: b 2 0.45
# 5: c 2 0.50
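Since the question asks for something scalable, here is a sketch of the same approach with more than one row in df2. The second row of df2 (for ID2 = 1) and the column name NewValue are invented for illustration and are not part of the original data. Naming the melted columns up front should make the join keys line up without renaming inside on=.

```r
library(data.table)

df1 <- data.table(ID1   = c("a", "b", "b", "b", "c"),
                  ID2   = c(1, 1, 1, 2, 2),
                  Value = c(0.2, 0.6, 0.6, 0.8, 0.4))

# Hypothetical lookup table with one row per ID2 value
df2 <- data.table(ID2 = c(1, 2),
                  a   = c(0.9, 0),
                  b   = c(0.7, 0.45),
                  c   = c(0.1, 0.5))

# Long format: one row per (ID2, ID1) pair; keep ID1 as character
ldf2 <- melt(df2, id.vars = "ID2",
             variable.name = "ID1", value.name = "NewValue",
             variable.factor = FALSE)

# Update by reference; rows of df1 without a match keep their old Value
df1[ldf2, Value := i.NewValue, on = c("ID1", "ID2")]
df1
```

Pairs in ldf2 that do not occur in df1 (for example a with ID2 = 2) are simply ignored by the join, so df2 can cover more combinations than df1 actually contains.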