Type<-c(1,2,1,2,2)
C1<-c(3,2,4,4,5)
C2<-c(3,7,3,4,5)
C3<-c(5,4,3,6,3)
DF<-data.frame(ID=c("A","B","C","D","E"),Type=Type,C1=C1,C2=C2,C3=C3)
DF
  ID Type C1 C2 C3
1  A    1  3  3  5
2  B    2  2  7  4
3  C    1  4  3  3
4  D    2  4  4  6
5  E    2  5  5  3
How can I calculate the mean of each column grouped by Type, ignoring the ID column? That is:
Type   C1   C2   C3
   1 3.50 3.00 4.00
   2 3.67 5.33 4.33
Thanks!
Answer 0 (score: 2)
Create the data with a Type column:
DF <- read.table(header=TRUE, text=' ID Type C1 C2 C3
1 A 1 3 3 5
2 B 2 2 7 4
3 C 1 4 3 3
4 D 2 4 4 6
5 E 2 5 5 3')
Then, knowing that the ID column is in position 1, a simple application of aggregate gets you what you want:
aggregate(.~Type, data=DF[-1], FUN=mean)
  Type       C1       C2       C3
1    1 3.500000 3.000000 4.000000
2    2 3.666667 5.333333 4.333333
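If you would rather not depend on ID sitting in position 1, here is a minimal sketch of the same aggregate call that selects columns by name instead. It assumes the value columns are exactly C1, C2 and C3; the names by_name and by_drop are only illustrative.

```r
# Same data as above, rebuilt so this block runs on its own
DF <- read.table(header = TRUE, text = "
ID Type C1 C2 C3
A 1 3 3 5
B 2 2 7 4
C 1 4 3 3
D 2 4 4 6
E 2 5 5 3")

# Name the value columns on the left-hand side of the formula,
# so ID is never part of the aggregation
by_name <- aggregate(cbind(C1, C2, C3) ~ Type, data = DF, FUN = mean)

# Alternative: drop ID by name, then keep the dot shorthand
by_drop <- aggregate(. ~ Type, data = DF[setdiff(names(DF), "ID")], FUN = mean)

print(by_name)
```

Both calls should give the same group means as the positional DF[-1] version; the by-name form just keeps working if the column order changes.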
Answer 1 (score: 1)
Some other approaches:
### plyr was written with this type of problem in mind
library(plyr)
ddply(DF[-1], .(Type), colMeans)
### staying in base; these are more unwieldy than `aggregate`
t(sapply(split(DF[-c(1,2)], DF$Type), colMeans))
### `ave` was also written for similar problems; however, it replaces every
### element with its group mean, so `unique` is needed afterwards:
unique(with(DF, ave(C1, Type)))
with(DF,
lapply(lapply(DF[-c(1,2)], ave, Type), unique)
)
### faster and scales well on large datasets
library(data.table)
DFt <- as.data.table(DF)
DFt[, list(C1=mean(C1), C2=mean(C2), C3=mean(C3)), by=Type]
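With many value columns, typing each mean() by hand gets tedious. A rough sketch of the usual data.table idiom for that case, using .SD together with .SDcols, is below. It assumes the data.table package is installed; value_cols and res are illustrative names only.

```r
library(data.table)

# Same data as above, rebuilt so this block runs on its own
DF <- read.table(header = TRUE, text = "
ID Type C1 C2 C3
A 1 3 3 5
B 2 2 7 4
C 1 4 3 3
D 2 4 4 6
E 2 5 5 3")
DFt <- as.data.table(DF)

# Columns to average; ID and Type are deliberately left out
value_cols <- c("C1", "C2", "C3")

# .SD is the per-group subset restricted to .SDcols;
# lapply(.SD, mean) returns one mean per column and keeps the column names
res <- DFt[, lapply(.SD, mean), by = Type, .SDcols = value_cols]
print(res)
```

Groups come back in order of first appearance with by=Type; use keyby=Type instead if you want the result sorted by Type.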