Question

Sample data:

df <- read.table(text="
Ratio col1 col2 col3 col4
0.3    1     1   1     1
0.4    1     1   1     2
0.5    1     1   2     1
0.6    2     2   1     1
", header=TRUE)

I want to summarize Ratio against each col column, one pair at a time, like this:

aggregate( Ratio ~ col1, data=df, mean)
aggregate( Ratio ~ col2, data=df, mean)
aggregate( Ratio ~ col3, data=df, mean)
aggregate( Ratio ~ col4, data=df, mean)

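How can I rewrite this with one of the apply family of functions so that all the summaries are computed in one go? In the real world, the call has to handle a variable number of columns, i.e. col1, col2, ..., coln.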

Answer 1

You can reshape the data to "long" format. Here is a data.table example (base R or dplyr/tidyr would work as well):

library(data.table)
dt <- as.data.table(df)
dt <- melt(dt, measure.vars = paste0("col", 1:4))
dt[, mean(Ratio), by = list(value, variable)]
#    value variable        V1
# 1:     1     col1 0.4000000
# 2:     2     col1 0.6000000
# 3:     1     col2 0.4000000
# 4:     2     col2 0.6000000
# 5:     1     col3 0.4333333
# 6:     2     col3 0.5000000
# 7:     1     col4 0.4666667
# 8:     2     col4 0.4000000
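
For comparison, the same long-format idea can be sketched in base R. This is only a sketch, assuming every column other than Ratio is a grouping column; the names `long` and `res` are made up for illustration, while `values` and `ind` are the column names that stack() produces.

```r
# Sample data from the question
df <- read.table(text="
Ratio col1 col2 col3 col4
0.3    1     1   1     1
0.4    1     1   1     2
0.5    1     1   2     1
0.6    2     2   1     1
", header=TRUE)

# stack() puts all grouping columns into two columns:
# `values` (the cell value) and `ind` (the source column name).
# Ratio is repeated once per grouping column so the rows line up.
long <- data.frame(Ratio = rep(df$Ratio, times = ncol(df) - 1),
                   stack(df[-1]))

# A single aggregate() call now covers every (column, value) pair
res <- aggregate(Ratio ~ values + ind, data = long, FUN = mean)
res
```

Because the grouping columns are selected with `df[-1]`, the sketch does not depend on how many col columns there are.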

Answer 2

This seems like a job for lapply.

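# Build one formula per grouping column: Ratio ~ col1, Ratio ~ col2, ...
fmla_list <- lapply(names(df)[-1], function(x) as.formula(paste(names(df)[1], x, sep = "~")))

# Run aggregate() once per formula
agg_list <- lapply(fmla_list, function(fmla) aggregate(fmla, data = df, FUN = mean))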
names(agg_list) <- names(df)[-1]
agg_list

Edit.

As lmo pointed out in the comments, the list of formulas can be created more cleanly with reformulate:

fmla_list <- lapply(names(df)[-1], function(x) reformulate(x, names(df)[1]))
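
Since the question asks about a variable number of columns, the two steps above could be wrapped in a small helper. The following is a minimal sketch, not a definitive implementation: `aggregate_each` is a hypothetical name, and it assumes the response is the first column unless told otherwise and that every remaining column is a grouping column.

```r
# Sample data from the question
df <- read.table(text="
Ratio col1 col2 col3 col4
0.3    1     1   1     1
0.4    1     1   1     2
0.5    1     1   2     1
0.6    2     2   1     1
", header=TRUE)

# Hypothetical helper: aggregate `response` by every other column of `data`
aggregate_each <- function(data, response = names(data)[1], FUN = mean) {
  # All columns except the response are treated as grouping columns
  groups <- setdiff(names(data), response)
  out <- lapply(groups, function(g) {
    # reformulate(g, response) builds the formula response ~ g
    aggregate(reformulate(g, response), data = data, FUN = FUN)
  })
  # Name each result after its grouping column
  setNames(out, groups)
}

agg_list <- aggregate_each(df)
agg_list$col3
```

Each element of the returned list is an ordinary aggregate() result, so `agg_list$col3` is the same data frame that `aggregate(Ratio ~ col3, data = df, mean)` gives.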

Answer 3

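# For each grouping column, build a small summary data frame, then rbind them
do.call(rbind, lapply(names(df)[-1], function(x)
    data.frame(col = x,
               # the distinct value that defines each group
               col_value = sapply(split(df[,x], df[,x]), unique),
               # mean of Ratio within each group
               Ratio_mean = sapply(split(df$Ratio, df[,x]), mean))))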
#    col col_value Ratio_mean
#1  col1         1  0.4000000
#2  col1         2  0.6000000
#11 col2         1  0.4000000
#21 col2         2  0.6000000
#12 col3         1  0.4333333
#22 col3         2  0.5000000
#13 col4         1  0.4666667
#23 col4         2  0.4000000

How can I "apply" "aggregate" to multiple columns at once?
