Collapse a data frame over one variable

Time: 2014-08-28 17:58:09

Tags: r dataframe collapse

I have a data frame in the following format:

Site    Year    Month   Count1  Count2  Count3  Patch
1       1       May     15      12      10      1
1       1       May     8       0       5       2
1       1       May     3       1       2       3
1       1       May     4       4       1       4
1       1       June    6       5       1       1
1       1       June    9       1       3       2
1       1       June    3       0       0       3
1       1       June    5       5       2       4
1       1       July    4       0       3       1
..........

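I would like to collapse the data frame over Patch, so that the three count variables are summed across patches within each Site/Year/Month combination, i.e.: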

Site    Year    Month   Count1  Count2  Count3
1       1       May     30      17      18
1       1       June    23      11      6
1       1       July    4       0       3
.........

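I have looked at the aggregate and tapply commands, but they do not seem to sum across patches the way I need.

Could someone suggest a command to transform the data accordingly?

Thanks.

3 answers:

Answer 0: (score: 3)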

library(dplyr) 
dat %>% 
group_by(Site, Year, Month) %>% 
summarise_each(funs(sum=sum(., na.rm=TRUE)), Count1:Count3)
# Source: local data frame [3 x 6]
#Groups: Site, Year

#    Site Year Month Count1 Count2 Count3
#  1    1    1  July      4      0      3  
#  2    1    1  June     23     11      6
#  3    1    1   May     30     17     18
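
summarise_each() and funs() have since been deprecated in dplyr. The following is a sketch of the same grouped sum written with across(), assuming dplyr 1.0.0 or later. The data frame dat is rebuilt here from the nine rows shown in the question, so the totals only cover those rows.

```r
library(dplyr)

# Sample data: the nine rows visible in the question (the real data is longer)
dat <- data.frame(
  Site   = 1,
  Year   = 1,
  Month  = c(rep("May", 4), rep("June", 4), "July"),
  Count1 = c(15, 8, 3, 4, 6, 9, 3, 5, 4),
  Count2 = c(12, 0, 1, 4, 5, 1, 0, 5, 0),
  Count3 = c(10, 5, 2, 1, 1, 3, 0, 2, 3),
  Patch  = c(1, 2, 3, 4, 1, 2, 3, 4, 1),
  stringsAsFactors = FALSE
)

# Sum every count column within each Site/Year/Month group;
# Patch is neither grouped nor summarised, so it drops out of the result
res <- dat %>%
  group_by(Site, Year, Month) %>%
  summarise(across(Count1:Count3, ~ sum(.x, na.rm = TRUE)), .groups = "drop")

res
```

As in the output above, the groups come back sorted by Month in alphabetical order, not in the order they appear in the data.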

Answer 1: (score: 3)

A data.table solution (this keeps the groups in the order in which the months first appear in the data)

library(data.table)
setDT(df)[, lapply(.SD, sum), 
            by = list(Site, Year, Month), 
            .SDcols = paste0("Count", seq_len(3))]

#    Site Year Month Count1 Count2 Count3
# 1:    1    1   May     30     17     18
# 2:    1    1  June     23     11      6
# 3:    1    1  July      4      0      3
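
One thing to be aware of: setDT() converts df to a data.table by reference, so the original object is changed. Below is a sketch of a variant that works on a copy instead, assuming the count columns can be identified by the prefix "Count" (true for the sample, an assumption for the full data). The sample data is rebuilt from the rows shown in the question.

```r
library(data.table)

# Sample data: the nine rows visible in the question
df <- data.frame(
  Site   = 1,
  Year   = 1,
  Month  = c(rep("May", 4), rep("June", 4), "July"),
  Count1 = c(15, 8, 3, 4, 6, 9, 3, 5, 4),
  Count2 = c(12, 0, 1, 4, 5, 1, 0, 5, 0),
  Count3 = c(10, 5, 2, 1, 1, 3, 0, 2, 3),
  Patch  = c(1, 2, 3, 4, 1, 2, 3, 4, 1),
  stringsAsFactors = FALSE
)

# as.data.table() makes a copy, so df itself stays a plain data.frame
dt <- as.data.table(df)

# Pick the count columns by name pattern (assumes they all start with "Count")
count_cols <- grep("^Count", names(dt), value = TRUE)

# Sum the selected columns per Site/Year/Month; na.rm = TRUE skips missing counts
res <- dt[, lapply(.SD, sum, na.rm = TRUE),
          by = list(Site, Year, Month),
          .SDcols = count_cols]

res
```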

Answer 2: (score: 2)

Using aggregate:

> ( a <- aggregate(.~Site+Year+Month, dat[-length(dat)], sum) )
#   Site Year Month Count1 Count2 Count3
# 1    1    1  July      4      0      3
# 2    1    1  June     23     11      6
# 3    1    1   May     30     17     18

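Here dat is your data; dat[-length(dat)] drops its last column (Patch) before aggregating.

Note that the July result in your post appears to be incorrect.

To get the result in the original data order, you can use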
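
The call above relies on Patch being the last column of dat. If that is not guaranteed, a sketch that names the count columns explicitly on the left-hand side of the formula may be safer. The sample data below is rebuilt from the rows shown in the question.

```r
# Sample data: the nine rows visible in the question
dat <- data.frame(
  Site   = 1,
  Year   = 1,
  Month  = c(rep("May", 4), rep("June", 4), "July"),
  Count1 = c(15, 8, 3, 4, 6, 9, 3, 5, 4),
  Count2 = c(12, 0, 1, 4, 5, 1, 0, 5, 0),
  Count3 = c(10, 5, 2, 1, 1, 3, 0, 2, 3),
  Patch  = c(1, 2, 3, 4, 1, 2, 3, 4, 1),
  stringsAsFactors = FALSE
)

# cbind() on the left-hand side lists the columns to sum;
# Patch is not mentioned anywhere, so its position in dat does not matter
a2 <- aggregate(cbind(Count1, Count2, Count3) ~ Site + Year + Month,
                data = dat, FUN = sum)

a2
```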

> a[order(as.character(unique(dat$Month))), ]
#   Site Year Month Count1 Count2 Count3
# 3    1    1   May     30     17     18
# 2    1    1  June     23     11      6
# 1    1    1  July      4      0      3
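
A caution on the reordering step: order() returns the permutation that sorts the month names, which happens to match the wanted row index for May/June/July but does not do so for every set of months, and it does not account for several sites or years. One possible alternative, sketched here on the sample rows from the question, is to make Month a factor whose levels follow the order of first appearance before aggregating, so that aggregate sorts the groups in that order itself.

```r
# Sample data: the nine rows visible in the question
dat <- data.frame(
  Site   = 1,
  Year   = 1,
  Month  = c(rep("May", 4), rep("June", 4), "July"),
  Count1 = c(15, 8, 3, 4, 6, 9, 3, 5, 4),
  Count2 = c(12, 0, 1, 4, 5, 1, 0, 5, 0),
  Count3 = c(10, 5, 2, 1, 1, 3, 0, 2, 3),
  Patch  = c(1, 2, 3, 4, 1, 2, 3, 4, 1),
  stringsAsFactors = FALSE
)

# Factor levels in order of first appearance: May, June, July
dat$Month <- factor(dat$Month, levels = unique(dat$Month))

# aggregate() orders groups by factor level, so no separate reordering is needed
a3 <- aggregate(cbind(Count1, Count2, Count3) ~ Site + Year + Month,
                data = dat, FUN = sum)

a3
```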