Column means across different data frames in R

Time: 2018-10-15 15:48:11

Tags: r merge dplyr mean lapply

I have two data frames:

df1 <- data.frame(id=c(1,2,3,4,5,6),val1=c(1,2,3,NA,NA,6))
df2 <- data.frame(id=c(3,4,7,6,8) , val1=c(1,2,3,4,5))

I want to combine the val1 values of df1 and df2 into their mean, matched by id. For example:

df1$val1

In pseudocode, something like

df1$val1 <- mean(df1$val1, df2$val1, na.rm=TRUE) & match(by=id)

should give the following:

df1$val1

2 Answers:

Answer 0 (score: 2)

We can try

library(data.table)
rbindlist(list(df1, df2))[, .(val1 = mean(val1, na.rm = TRUE)), id][id %in% df1$id]
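
To see what that one-liner does, here is a sketch that splits it into steps on the question's sample data. The names stacked, means and res are only illustrative and are not part of the original answer.

```r
library(data.table)

df1 <- data.frame(id = c(1, 2, 3, 4, 5, 6), val1 = c(1, 2, 3, NA, NA, 6))
df2 <- data.frame(id = c(3, 4, 7, 6, 8), val1 = c(1, 2, 3, 4, 5))

# Step 1: stack both tables into one long data.table (11 rows)
stacked <- rbindlist(list(df1, df2))

# Step 2: one mean per id, ignoring missing values
means <- stacked[, .(val1 = mean(val1, na.rm = TRUE)), by = id]

# Step 3: keep only the ids that occur in df1 (drops ids 7 and 8)
res <- means[id %in% df1$id]
print(res)
```

Note that id 5 has no non-missing value in either table, so its mean is taken over zero values and should come back as NaN rather than a number.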

Another option is a join that updates df1 by reference:

setDT(df1)[df2, val1 := rowMeans(cbind(val1, i.val1), na.rm = TRUE), on = .(id)]

Or, as @Frank mentioned in the comments:

setDT(df1); setDT(df2)
df1[, v := df2[df1, on=.(id), mean(c(x.val1, i.val1),
          na.rm=TRUE), by=.EACHI]$V1]
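
For comparison, the same matching logic can be sketched in base R without data.table. This is not from the original answers: other and combined are illustrative names, and converting NaN back to NA is an assumption about what is wanted for id 5, where both sources are missing.

```r
df1 <- data.frame(id = c(1, 2, 3, 4, 5, 6), val1 = c(1, 2, 3, NA, NA, 6))
df2 <- data.frame(id = c(3, 4, 7, 6, 8), val1 = c(1, 2, 3, 4, 5))

# Look up each df1 id in df2; ids without a match give NA
other <- df2$val1[match(df1$id, df2$id)]

# Row-wise mean of the two candidate values, skipping NAs
combined <- rowMeans(cbind(df1$val1, other), na.rm = TRUE)

# rowMeans returns NaN when every value in a row is NA;
# assumption: such rows should stay NA
combined[is.nan(combined)] <- NA

df1$val1 <- combined
print(df1)
```

Because match() returns only the first match, this sketch assumes that ids are unique within df2.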

Answer 1 (score: 1)

I would use a tidyverse solution.

library(dplyr)
df1 <- data.frame(id=c(1,2,3,4,5,6),val1=c(1,2,3,NA,NA,6))
df2 <- data.frame(id=c(3,4,7,6,8) , val1=c(1,2,3,4,5))

# Join on id, keep the two val1 columns (val1.x, val1.y), then average them row by row
df1 %>% left_join(df2, by = "id") %>% select(2:3) %>%
  transmute(val1 = rowMeans(., na.rm = TRUE))
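
The pipeline above returns only the val1 column. A variant that keeps id and turns the all-missing case back into NA might look like the sketch below. It assumes dplyr's default .x/.y suffixes for the clashing val1 columns, and result is an illustrative name.

```r
library(dplyr)

df1 <- data.frame(id = c(1, 2, 3, 4, 5, 6), val1 = c(1, 2, 3, NA, NA, 6))
df2 <- data.frame(id = c(3, 4, 7, 6, 8), val1 = c(1, 2, 3, 4, 5))

result <- df1 %>%
  # After the join the two val1 columns are named val1.x (df1) and val1.y (df2)
  left_join(df2, by = "id") %>%
  # Average the pair row by row, ignoring NAs, and keep id alongside
  transmute(id, val1 = rowMeans(cbind(val1.x, val1.y), na.rm = TRUE)) %>%
  # rowMeans gives NaN when both values are missing (id 5); map that to NA
  mutate(val1 = if_else(is.nan(val1), NA_real_, val1))

print(result)
```

Using column names instead of positions (select(2:3)) keeps the code working if more columns are added to either data frame later.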