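I am trying to get the mean of columns in a data frame based on unique values. In this example, I want the mean of columns b and c for each unique value in column a. I thought .(a) would make the calculation run per unique value of a (it does return the unique values of a), but it only gives the mean of the entire column b or c.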
library(plyr)
df2 <- data.frame(a=seq(1:5), b=c(1:10), c=c(11:20))
simVars <- c("b", "c")
for (var in simVars) {
  print(var)
  dat = ddply(df2, .(a), summarize, mean_val = mean(df2[[var]])) ## my script
  assign(var, dat)
}
c
a mean_val
1 15.5
2 15.5
3 15.5
4 15.5
5 15.5
How can I average the columns based on the unique values in column a? Thanks.
Answer 0 (score: 0)
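You don't need a loop. Just compute the means of b and c in a single call to ddply, and the means will be calculated separately for each value of a. Your version returns 15.5 everywhere because df2[[var]] refers to the whole column of the original data frame, not to the rows of the current group. As @Gregor said, you don't need to re-specify the data frame name inside mean():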
ddply(df2, .(a), summarise,
      mean_b=mean(b),
      mean_c=mean(c))
  a mean_b mean_c
1 1    3.5   13.5
2 2    4.5   14.5
3 3    5.5   15.5
4 4    6.5   16.5
5 5    7.5   17.5
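If one result per column is still wanted from a loop, as in the question, the loop only needs mean() to see each group's rows instead of the whole column. The following is a minimal base R sketch of that idea, not the answer's method: it swaps ddply for tapply, and the `results` list is a hypothetical name used here in place of assign().

```r
# Same data as in the question; rep(1:5, 2) spells out the recycling of a
df2 <- data.frame(a = rep(1:5, 2), b = 1:10, c = 11:20)
simVars <- c("b", "c")

# Hypothetical container: a named list instead of assign()-ing globals
results <- list()
for (var in simVars) {
  # tapply splits df2[[var]] by the values of a and averages each piece
  grp <- tapply(df2[[var]], df2$a, mean)
  results[[var]] <- data.frame(a = as.integer(names(grp)),
                               mean_val = as.vector(grp))
}

results$c
```

Keeping the results in a list avoids overwriting the base function c() with a data frame, which is what assign("c", dat) does in the original loop.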
Update: to get a separate data frame for each column:
# Add a few additional columns to the data frame
df2 = data.frame(a=seq(1:5), b=c(1:10), c=c(11:20), d=c(21:30), e=c(31:40))
# New data frame with means by each level of column a
# (summarise_each()/funs() are deprecated; across() needs dplyr >= 1.0.0)
library(dplyr)
dfmeans = df2 %>%
  group_by(a) %>%
  summarise(across(everything(), mean))
# Separate each column of means into a separate data frame and store it in a list:
means.list = lapply(names(dfmeans)[-1], function(x) {
  cbind(dfmeans[,"a"], dfmeans[,x])
})
means.list
[[1]]
  a   b
1 1 3.5
2 2 4.5
3 3 5.5
4 4 6.5
5 5 7.5

[[2]]
  a    c
1 1 13.5
2 2 14.5
3 3 15.5
4 4 16.5
5 5 17.5

[[3]]
  a    d
1 1 23.5
2 2 24.5
3 3 25.5
4 4 26.5
5 5 27.5

[[4]]
  a    e
1 1 33.5
2 2 34.5
3 3 35.5
4 4 36.5
5 5 37.5
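The same list of per-column data frames can probably be built without dplyr as well. This is a rough base R sketch under that assumption, using aggregate() for the grouped means; `value_cols` is a hypothetical helper name, and naming the list elements is an addition not present in the answer.

```r
# Same extended data frame as above, with a written out via rep()
df2 <- data.frame(a = rep(1:5, 2), b = 1:10, c = 11:20,
                  d = 21:30, e = 31:40)

# Mean of every other column for each value of a
dfmeans <- aggregate(. ~ a, data = df2, FUN = mean)

# One two-column data frame (a plus one column of means) per variable
value_cols <- setdiff(names(dfmeans), "a")
means.list <- lapply(value_cols, function(col) dfmeans[, c("a", col)])
names(means.list) <- value_cols

means.list$d
```

Naming the elements lets each result be looked up by column name (means.list$d) rather than by position.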