How do I use aggregate with names and factors properly?

Time: 2014-08-13 11:52:55

Tags: r

I want to get the mean and SD, but I am having trouble including labels and factors in the aggregate command.

Sample data:

    ID C1 C2 C3     
     1  3  1  0         
     2  2  1  0    
     3  4  1  0     
     4  4  0  1        
     5  5  0  1      

    aggregate(C1, by = list(C2, C3), mean)

The output is:

    Group.1 Group.2   x
       1       0      3.0
       1       1      4.5

How can I get labelled values, ideally with a function that produces output like this:

       My_Location    Your_location     mean
       my_in           your_out          3.0
       my_in           your_in           4.5

2 answers:

Answer 0: (score: 1)

If dat is the dataset:

 res <- with(dat,aggregate(C1, by=list(Time=C2, Area=C3),mean))
 colnames(res)[3] <- "mean"
 # !x is TRUE for 0 and FALSE for 1, so (!x)+1 picks "yes" for 1 and "no" for 0
 res[,1:2] <- c("yes", "no")[(!res[,1:2])+1]
 res
 #  Time Area mean
 #1  yes   no 3.0
 #2   no  yes 4.5

Data

 dat <- structure(list(ID = 1:5, C1 = c(3L, 2L, 4L, 4L, 5L), C2 = c(1L, 
 1L, 1L, 0L, 0L), C3 = c(0L, 0L, 0L, 1L, 1L)), .Names = c("ID", 
 "C1", "C2", "C3"), class = "data.frame", row.names = c(NA, -5L
 ))
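
Since the question mentions factors, here is a minimal sketch of an alternative to the index trick above: label the 0/1 codes with factor() before aggregating. The column names Time and Area follow this answer; the level order and the "yes"/"no" labels are assumptions made for illustration.

```r
# Same data as above, rebuilt here so the block runs on its own
dat <- data.frame(ID = 1:5,
                  C1 = c(3L, 2L, 4L, 4L, 5L),
                  C2 = c(1L, 1L, 1L, 0L, 0L),
                  C3 = c(0L, 0L, 0L, 1L, 1L))

# Turn the 0/1 codes into labelled factors (1 -> "yes", 0 -> "no")
dat$Time <- factor(dat$C2, levels = c(1, 0), labels = c("yes", "no"))
dat$Area <- factor(dat$C3, levels = c(1, 0), labels = c("yes", "no"))

# Aggregate on the labelled columns; the groups come out already labelled
res <- aggregate(C1 ~ Time + Area, data = dat, FUN = mean)
names(res)[3] <- "mean"
res
```

The rows are ordered by factor level here, so the row order may differ from the output shown above.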

Update

If you don't want to change the grouping column names:

 aggregate(C1~C2+C3, data=dat, FUN=mean)
 #  C2 C3  C1
 #1  1  0 3.0
 #2  0  1 4.5

One option is to use setNames to change the names:

 setNames(aggregate(C1~C2+C3, data=dat, FUN=mean), c("Time", "Area", "mean"))
 #   Time Area mean
 #1    1    0  3.0
 #2    0    1  4.5
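
The question also asks for the SD. A minimal sketch, assuming the same dat as above: FUN can return a named vector, and do.call(data.frame, ...) flattens the resulting matrix column into ordinary columns. The names C1.mean and C1.sd are what data.frame() generates, not names taken from the question.

```r
dat <- data.frame(ID = 1:5,
                  C1 = c(3L, 2L, 4L, 4L, 5L),
                  C2 = c(1L, 1L, 1L, 0L, 0L),
                  C3 = c(0L, 0L, 0L, 1L, 1L))

# FUN returns two named values; aggregate() stores them as one matrix column C1
agg <- aggregate(C1 ~ C2 + C3, data = dat,
                 FUN = function(x) c(mean = mean(x), sd = sd(x)))

# Flatten the matrix column into separate columns C1.mean and C1.sd
res <- do.call(data.frame, agg)
res
```

setNames() can then be applied to res in the same way as above if other column names are preferred.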

UPDATE2

Using the same dataset, the Group.1 column in the output shown in the question is not correct. To get that output:

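 dat$C2 <- 1   # reproduces the question's output, where Group.1 is 1 in both rows
 res <- with(dat, aggregate(C1, by=list(My_Location=C2, Your_location=C3), mean))
 colnames(res)[3] <- "mean"
 # 1 -> "in", 0 -> "out"
 res[,1:2] <- c("in", "out")[(!res[,1:2])+1]
 # prefix each value with the lower-cased part of its column name before "_" ("my", "your")
 res[,1:2] <- Map(function(x,y) paste(x,y,sep="_"), tolower(gsub("\\_.*","",colnames(res)[1:2])), res[,1:2])
 res
 #   My_Location Your_location mean
 #1        my_in      your_out  3.0
 #2        my_in       your_in  4.5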
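
The Map/gsub line is compact but hard to read. As a sketch of the same labelling written out explicitly, assuming the prefixes "my_" and "your_" are simply hard-coded and C2 is set to 1 as above:

```r
# Question's data with C2 forced to 1, as in UPDATE2
dat <- data.frame(C1 = c(3, 2, 4, 4, 5),
                  C2 = 1,
                  C3 = c(0, 0, 0, 1, 1))

res <- aggregate(C1 ~ C2 + C3, data = dat, FUN = mean)
names(res) <- c("My_Location", "Your_location", "mean")

# Recode 1 -> "in", 0 -> "out" and add the per-column prefix
res$My_Location   <- paste0("my_",   ifelse(res$My_Location   == 1, "in", "out"))
res$Your_location <- paste0("your_", ifelse(res$Your_location == 1, "in", "out"))
res
```

This trades generality for readability: the prefixes are written by hand instead of being derived from the column names.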

Answer 1: (score: 0)

You can also use data.table:

library(data.table)

# random example data, so the values below will differ between runs
time = sample(c("no","yes"),50,replace=TRUE)
area = sample(c("no","yes"),50,replace=TRUE)
num = sample(1:10, 50, replace=TRUE)

ddt = data.table(time, area, num)

head(ddt)
   time area num
1:  yes  yes   9
2:   no  yes   2
3:  yes  yes   3
4:   no  yes   2
5:  yes   no  10
6:  yes  yes   4

ddt[,mean(num),by= list(time, area)]
   time area       V1
1:  yes  yes 4.636364
2:   no  yes 5.363636
3:  yes   no 5.555556
4:   no   no 7.000000
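
The result column above gets the default name V1. A small sketch of how the output columns could be named and the SD added in the same call, applied to the question's data rather than the random sample; it assumes the data.table package is installed.

```r
library(data.table)

# The question's data as a data.table
dt <- data.table(C1 = c(3, 2, 4, 4, 5),
                 C2 = c(1, 1, 1, 0, 0),
                 C3 = c(0, 0, 0, 1, 1))

# .() is data.table's alias for list(); naming the entries names the output columns
res <- dt[, .(mean = mean(C1), sd = sd(C1)), by = .(C2, C3)]
res
```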

Edit: a simple function to change the format of the aggregate output:

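# ddf is assumed to hold the question's data (called dat in the other answer)
output = with(ddf, aggregate(C1, by = list(C2, C3), mean))
output
  Group.1 Group.2   x
1       1       0 3.0
2       0       1 4.5

myfn = function(out){
    names(out) = c("Time", "area", "mean")
    # recode the 0/1 group codes; assign into out, not the global output
    # (a mean of exactly 0 or 1 would be recoded as well)
    out[out == 0] = "no"
    out[out == 1] = "yes"
    out
}

myfn(output)
  Time area mean
1  yes   no  3.0
2   no  yes  4.5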

Or use aggregate with a formula, as follows. It keeps the original column names:

aggregate(C1~C2+C3, ddf, mean)
  C2 C3  C1
1  1  0 3.0
2  0  1 4.5