I want to get the mean and SD, but I am having trouble including labels and factors in the aggregate command.
Sample data:
ID C1 C2 C3
1 3 1 0
2 2 1 0
3 4 1 0
4 4 0 1
5 5 0 1
aggregate (C1 , by = list( C2, C3 ), mean)
The output is:
Group.1 Group.2 x
1 0 3.0
1 1 4.5
How can I get labelled values, and a function that produces output like this:
My_Location Your_location mean
my_in your_out 3.0
my_in your_in 4.5
Answer 0 (score: 1)
If dat is the dataset:
res <- with(dat,aggregate(C1, by=list(Time=C2, Area=C3),mean))
colnames(res)[3] <- "mean"
res[,1:2] <- c("yes", "no")[(!res[,1:2])+1]
res
# Time Area mean
#1 yes no 3.0
#2 no yes 4.5
dat <- structure(list(ID = 1:5, C1 = c(3L, 2L, 4L, 4L, 5L), C2 = c(1L,
1L, 1L, 0L, 0L), C3 = c(0L, 0L, 0L, 1L, 1L)), .Names = c("ID",
"C1", "C2", "C3"), class = "data.frame", row.names = c(NA, -5L
))
If you don't want to change the grouping names:
aggregate(C1~C2+C3, data=dat, FUN=mean)
# C2 C3 C1
#1 1 0 3.0
#2 0 1 4.5
Another option is to use setNames to change the names:
setNames(aggregate(C1~C2+C3, data=dat, FUN=mean), c("Time", "Area", "mean"))
# Time Area mean
#1 1 0 3.0
#2 0 1 4.5
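The question asks for both the mean and the SD, while the calls above return only the mean. A minimal sketch of one way to get both from a single aggregate call, assuming the question's sample data; the object name res_sd and the final column names are illustrative choices, not part of the original answer:

```r
# Question's sample data, rebuilt so the block runs on its own
dat <- data.frame(ID = 1:5,
                  C1 = c(3, 2, 4, 4, 5),
                  C2 = c(1, 1, 1, 0, 0),
                  C3 = c(0, 0, 0, 1, 1))

# FUN returns a named vector, so aggregate() stores a two-column matrix in C1
res_sd <- aggregate(C1 ~ C2 + C3, data = dat,
                    FUN = function(x) c(mean = mean(x), sd = sd(x)))

# Flatten the matrix column into ordinary columns (C1.mean, C1.sd)
res_sd <- do.call(data.frame, res_sd)

# Illustrative names, reusing the Time/Area labels from this answer
names(res_sd) <- c("Time", "Area", "mean", "sd")
res_sd
```

The do.call(data.frame, ...) step is needed because aggregate keeps the two statistics in one matrix-valued column, which is awkward to rename or subset directly.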
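With the same dataset, the Group.1 column in the expected output does not match the data (it shows my_in in both rows). To get that output, set C2 to 1 for every row: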
dat$C2 <- 1
res <- with(dat, aggregate(C1, by=list(My_Location=C2, Your_location=C3), mean))
colnames(res)[3] <- "mean"
res[,1:2] <- c("in", "out")[(!res[,1:2])+1]
res[,1:2] <- Map(function(x,y) paste(x,y,sep="_"), tolower(gsub("\\_.*","",colnames(res)[1:2])), res[,1:2])
res
# My_Location Your_location mean
#1 my_in your_out 3.0
#2 my_in your_in 4.5
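Since the question mentions labels and factors, an alternative sketch is to attach the labels as factor levels before aggregating, so no recoding of the result is needed. The mapping 1 -> "in", 0 -> "out" and the label strings are assumptions read off the desired output, and this version uses the unmodified sample data, so My_Location is not "my_in" in every row:

```r
# Question's sample data, rebuilt so the block runs on its own
dat <- data.frame(ID = 1:5,
                  C1 = c(3, 2, 4, 4, 5),
                  C2 = c(1, 1, 1, 0, 0),
                  C3 = c(0, 0, 0, 1, 1))

# Assumed labels: 1 -> "*_in", 0 -> "*_out"
dat$My_Location   <- factor(dat$C2, levels = c(1, 0),
                            labels = c("my_in", "my_out"))
dat$Your_location <- factor(dat$C3, levels = c(1, 0),
                            labels = c("your_in", "your_out"))

# Group by the labelled factors instead of the raw 0/1 codes
res_lab <- aggregate(C1 ~ My_Location + Your_location, data = dat, FUN = mean)
names(res_lab)[3] <- "mean"
res_lab
```

Labelling the input keeps the code-to-label mapping in one place, whereas recoding the aggregated result has to be repeated for every summary.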
Answer 1 (score: 0)
data.table:
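library(data.table)
# random example data, so the values shown below will differ between runs
time = sample(c("no","yes"), 50, replace=TRUE)
area = sample(c("no","yes"), 50, replace=TRUE)
num = sample(1:10, 50, replace=TRUE)
ddt = data.table(time, area, num)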
head(ddt)
time area num
1: yes yes 9
2: no yes 2
3: yes yes 3
4: no yes 2
5: yes no 10
6: yes yes 4
ddt[,mean(num),by= list(time, area)]
time area V1
1: yes yes 4.636364
2: no yes 5.363636
3: yes no 5.555556
4: no no 7.000000
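The grouped call above leaves the result column named V1 and returns only the mean. A short sketch of the same data.table idiom with named summary columns and the SD added, applied here to the question's sample data rather than the random data; the names ddt2, res_dt, Time and Area are illustrative, and the data.table package is assumed to be installed:

```r
library(data.table)

# Question's sample data as a data.table
ddt2 <- data.table(C1 = c(3, 2, 4, 4, 5),
                   C2 = c(1, 1, 1, 0, 0),
                   C3 = c(0, 0, 0, 1, 1))

# .() is data.table's alias for list(); naming its elements names the columns
res_dt <- ddt2[, .(mean = mean(C1), sd = sd(C1)),
               by = .(Time = C2, Area = C3)]
res_dt
```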
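Edit: a simple function that changes the format of the aggregate output (ddf is assumed to be the question's data frame):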
output = with(ddf, aggregate (C1 , by = list( C2, C3 ), mean))
output
Group.1 Group.2 x
1 1 0 3.0
2 0 1 4.5
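myfn = function(out){
names(out) = c("Time","area","mean")
# recode only the two grouping columns: 1 -> "yes", anything else -> "no"
out[1:2] = lapply(out[1:2], function(x) ifelse(x == 1, "yes", "no"))
out
}
myfn(output)
Time area mean
1 yes no 3.0
2 no yes 4.5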
Or use the formula interface of aggregate as follows; it keeps the original column names:
aggregate(C1~C2+C3, ddf, mean)
C2 C3 C1
1 1 0 3.0
2 0 1 4.5