I need to take the average of two variables:
df1 <- structure(list(date = c("1/01/2005", "2/01/2005", "3/01/2005",
"31/12/2005"), x1 = c(52L, 90L, NA, NA), x5 = c(33L, 24L, 72L, 52L)), .Names =
c("date", "x1", "x5"), class = "data.frame", row.names = c(NA, -4L))
df2 <- structure(list(date = c("1/01/2005", "5/04/2006", "13/04/2005",
"31/12/2005"), x2 = c(20L, 50L, NA, NA), x3 = c(NA, NA, NA, NA),
x1 = c(115L, 125L, 127L, 138L)), .Names = c("date", "x2",
"x3", "x1"), class = "data.frame", row.names = c(NA, -4L))
If the same date exists in both df1 and df2, take the average of x1 for that date. Desired output:
date x1
1/01/2005 83.5
31/12/2005 138
Answer 0 (score: 2)
Your new merged data.frame:
db_merge<-merge(df1,df2,by.x="date",by.y="date")
The mean, excluding NA values:
db_merge$x1<-rowMeans(db_merge[,c(2,6)], na.rm=TRUE)
db_merge<-db_merge[,c(1,7)]
Output:
db_merge
date x1
1 1/01/2005 83.5
2 31/12/2005 138.0
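The positional indices c(2,6) and c(1,7) above only work while the merged column order stays the same. As a minimal sketch (not part of the original answer), the same result can be obtained by selecting the suffixed columns by name. It assumes merge() keeps its default suffixes ".x" and ".y", and it rebuilds the data with data.frame() using the column names that the question's .Names attribute assigns. The names merged and result are only illustrative.

```r
# Data from the question, rebuilt with the effective column names
df1 <- data.frame(
  date = c("1/01/2005", "2/01/2005", "3/01/2005", "31/12/2005"),
  x1   = c(52L, 90L, NA, NA),
  x5   = c(33L, 24L, 72L, 52L),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  date = c("1/01/2005", "5/04/2006", "13/04/2005", "31/12/2005"),
  x2   = c(20L, 50L, NA, NA),
  x3   = c(NA, NA, NA, NA),
  x1   = c(115L, 125L, 127L, 138L),
  stringsAsFactors = FALSE
)

# merge() keeps only dates present in both frames; the clashing x1 columns
# become x1.x (from df1) and x1.y (from df2) under the default suffixes
merged <- merge(df1, df2, by = "date")

# Select the two x1 columns by name instead of by position
result <- data.frame(
  date = merged$date,
  x1   = unname(rowMeans(merged[, c("x1.x", "x1.y")], na.rm = TRUE)),
  stringsAsFactors = FALSE
)
print(result)  # x1 is 83.5 for 1/01/2005 and 138 for 31/12/2005
```

Selecting by name keeps working if extra columns are added to either data frame, as long as only x1 clashes between them.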
Answer 1 (score: 0)
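Another option could be to use inner_join and then summarise by date. This avoids adding/mutating columns.
library(dplyr)
df1 %>% inner_join(df2, by = "date") %>%
  rowwise() %>%
  group_by(date) %>%
  # mean of both x1 columns per date, ignoring NA
  summarise(x1 = mean(c(x1.x, x1.y), na.rm = TRUE))
The result:
# A tibble: 2 x 2
#   date          x1
#   <chr>      <dbl>
# 1 1/01/2005   83.5
# 2 31/12/2005 138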