I have a data frame like the following:
df <- data.frame(category = c('A','B','C','A','C','B'),
value = c(5.4, 5, 3.4,7.5,6.7,3.5),
status = c('HC','D','D','HC','HC','D'))
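I want to compute the mean of value for every combination of category and status, for example for ('A','HC') and ('B','HC'). If a combination has only one value, the output should be just that single value.
How can I do this?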
Answer 0 (score: 0)
You can use dplyr or data.table:
library(dplyr)
# group by both columns, then average value within each group
df %>% group_by(category, status) %>%
  summarize(mean_value = mean(value))
  category status mean_value
    <fctr> <fctr>      <dbl>
1        A     HC       6.45
2        B      D       4.25
3        C      D       3.40
4        C     HC       6.70
See also @PoGibas's comment regarding a data.table answer.
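For reference, a data.table version might look like the sketch below. It assumes the data.table package is installed and rebuilds df from the question so that it runs on its own; the names dt and res are just illustrative.

```r
library(data.table)

# same data as in the question
df <- data.frame(category = c('A','B','C','A','C','B'),
                 value = c(5.4, 5, 3.4, 7.5, 6.7, 3.5),
                 status = c('HC','D','D','HC','HC','D'))

dt <- as.data.table(df)
# one row per category/status combination that occurs in the data
res <- dt[, .(mean_value = mean(value)), by = .(category, status)]
print(res)
```

A group with a single row, such as C/D here, simply gets that one value back as its mean.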
Answer 1 (score: 0)
Here is another solution:
# build one key per row from its category/status combination
keys <- paste(df$category, df$status, sep = ";")
# the set of all category/status combinations present in df
all.combinations <- unique(keys)
# return the mean of value for one given combination (selected by index)
fun1 <- function(x) {
  indices <- which(keys == all.combinations[x])
  mean(df$value[indices])
}
# finally, apply the function to every combination present in df
sapply(seq_along(all.combinations), fun1)
[1] 6.45 4.25 3.40 6.70
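If no extra packages are wanted, the same grouping can probably be written more compactly with base R's aggregate(). This is a sketch rather than part of the original answer; it rebuilds df from the question, and means is just an illustrative name.

```r
# same data as in the question
df <- data.frame(category = c('A','B','C','A','C','B'),
                 value = c(5.4, 5, 3.4, 7.5, 6.7, 3.5),
                 status = c('HC','D','D','HC','HC','D'))

# mean of value for each category/status combination that occurs in df
means <- aggregate(value ~ category + status, data = df, FUN = mean)
print(means)
```

Unlike the sapply() version, this keeps the category and status labels next to each mean. Combinations that never occur in the data, such as ('B','HC'), do not appear in the result.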