Combine rows by group

Time: 2017-02-28 11:18:40

Tags: r

I have a dataset:

>data.frame(GROUP=c("A","A","A","G","G","F","F","E","T"), 
               FIRST=c(10,2,3,6,NA,NA,NA,1,NA), 
               SECOND=c(3,NA,NA,1,NA,4,2,1,NA), 
               THIRD=c(5,7,NA,NA,NA,1,NA,1,1))
      GROUP FIRST SECOND THIRD
    1     A    10      3     5
    2     A     2     NA     7
    3     A     3     NA    NA
    4     G     6      1    NA
    5     G    NA     NA    NA
    6     F    NA      4     1
    7     F    NA      2    NA
    8     E     1      1     1
    9     T    NA     NA     1

I want to combine the data by the GROUP column in two ways:

Column means within each group

  GROUP FIRST SECOND THIRD
1     A     5      3     6
2     G     6      1    NA
3     F    NA      3     1
4     E     1      1     1
5     T    NA     NA     1

Column-wise maximum within each group

  GROUP FIRST SECOND THIRD
1     A    10      3     7
2     G     6      1    NA
3     F    NA      4     1
4     E     1      1     1
5     T    NA     NA     1

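Is there a quick way to do this, or should I write a new function?

1 answer:

Answer 0: (score: 2)

We can use aggregate from base R (here d1 is the data frame from the question):
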
aggregate(.~GROUP, d1, mean, na.rm = TRUE, na.action=NULL)
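
One caveat with the call above: when every value in a group is NA, mean(x, na.rm = TRUE) returns NaN, and max(x, na.rm = TRUE) returns -Inf with a warning, rather than the NA shown in the desired output. A minimal base R sketch that returns NA for such groups follows. The helper name na_safe is hypothetical (not part of R), and na.action = na.pass is used here as an alternative way to keep rows that contain NA.

```r
# Rebuild the question's data as d1 (the name the answer assumes).
d1 <- data.frame(GROUP  = c("A", "A", "A", "G", "G", "F", "F", "E", "T"),
                 FIRST  = c(10, 2, 3, 6, NA, NA, NA, 1, NA),
                 SECOND = c(3, NA, NA, 1, NA, 4, 2, 1, NA),
                 THIRD  = c(5, 7, NA, NA, NA, 1, NA, 1, 1))

# Hypothetical helper: wrap a summary function f so that it is applied
# to the non-missing values only and yields NA when none remain.
na_safe <- function(f) {
  function(x) {
    x <- x[!is.na(x)]
    if (length(x) == 0) NA_real_ else f(x)
  }
}

# na.pass keeps rows containing NA so the helper can deal with them.
group_means <- aggregate(. ~ GROUP, d1, na_safe(mean), na.action = na.pass)
group_maxes <- aggregate(. ~ GROUP, d1, na_safe(max),  na.action = na.pass)

print(group_means)
print(group_maxes)
```

Note that aggregate sorts the result by group, so the rows come back in the order A, E, F, G, T rather than in order of first appearance.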

Or with dplyr, for the mean:

library(dplyr)
d1 %>%
  group_by(GROUP) %>%
  summarise_each(funs(mean=mean(., na.rm = TRUE)))

Or, for the column-wise maximum:

d1 %>%
  group_by(GROUP) %>%
  summarise_each(funs(max=max(., na.rm = TRUE)))
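
summarise_each() and funs() have since been deprecated in dplyr. As a sketch only, assuming dplyr 1.0.0 or later is installed, the same summaries can be written with across(). The all-NA guard in the maximum is an addition that is not in the original answer; without it, max() returns -Inf with a warning for groups that contain no values.

```r
library(dplyr)  # assumes dplyr >= 1.0.0, which provides across()

# Rebuild the question's data as d1 (the name the answer assumes).
d1 <- data.frame(GROUP  = c("A", "A", "A", "G", "G", "F", "F", "E", "T"),
                 FIRST  = c(10, 2, 3, 6, NA, NA, NA, 1, NA),
                 SECOND = c(3, NA, NA, 1, NA, 4, 2, 1, NA),
                 THIRD  = c(5, 7, NA, NA, NA, 1, NA, 1, 1))

# Group means; a group with no non-missing values yields NaN.
group_means <- d1 %>%
  group_by(GROUP) %>%
  summarise(across(everything(), ~ mean(.x, na.rm = TRUE)))

# Group maxima; return NA explicitly when a group is entirely NA.
group_maxes <- d1 %>%
  group_by(GROUP) %>%
  summarise(across(everything(),
                   ~ if (all(is.na(.x))) NA_real_ else max(.x, na.rm = TRUE)))

print(group_means)
print(group_maxes)
```

across(everything(), ...) skips the grouping column, so only FIRST, SECOND and THIRD are summarised.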