R: Group rows and sum over specific subsets

Time: 2018-09-19 20:37:59

Tags: r

I can't solve the following problem and would appreciate your help:

Starting point:

x1<-matrix(c("A","A","A","A","A","B","B","B","B","B",
"x1","x2","x3","x4","x5","x1","x2","x3","x4","x5",
1,2,3,4,5,6,7,8,9,10),nrow = 10, ncol = 3)
x1

Desired result:

x2<- matrix(c("A","A","B","B",6,9,21,19),nrow = 4, ncol = 2)
x2
  • For A: x1 + x2 + x3 = 1 + 2 + 3 = 6, x4 + x5 = 4 + 5 = 9
  • For B: x1 + x2 + x3 = 6 + 7 + 8 = 21, x4 + x5 = 9 + 10 = 19

I would like to avoid generating two separate datasets (one for x1, x2, x3 and another for x4, x5). Does anyone know how to solve this? Many thanks!

2 answers:

Answer 0: (score: 3)

data.table:

library(data.table)
as.data.table(x1)[, .(vsum = sum(as.numeric(V3))), .(V1, grepl('[1-3]', V2))]
#    V1 grepl vsum
# 1:  A  TRUE    6
# 2:  A FALSE    9
# 3:  B  TRUE   21
# 4:  B FALSE   19

Base R:

aggregate(as.numeric(x1[,3]), by = list(!grepl('[1-3]', x1[, 2]), x1[, 1]), sum)[, -1]
#   Group.2  x
# 1       A  6
# 2       A  9
# 3       B 21
# 4       B 19
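
If the subsets are not easy to describe with a single regular expression, one possible variant is to label each item through an explicit lookup vector and aggregate on that label. This is only a sketch and not part of the original answer: the column names (group, item, value), the lookup vector subset_of, and the labels "x1-x3" and "x4-x5" are assumptions made for illustration.

```r
# Rebuild the question's data as a data frame with typed columns.
# The column names (group, item, value) are illustrative assumptions.
df <- data.frame(
  group = rep(c("A", "B"), each = 5),
  item  = rep(paste0("x", 1:5), times = 2),
  value = 1:10,
  stringsAsFactors = FALSE
)

# Hypothetical lookup: maps each item to the subset it belongs to.
subset_of <- c(x1 = "x1-x3", x2 = "x1-x3", x3 = "x1-x3",
               x4 = "x4-x5", x5 = "x4-x5")
df$subset <- unname(subset_of[df$item])

# Sum value within each group/subset combination, then sort for readability.
result <- aggregate(value ~ group + subset, data = df, FUN = sum)
result <- result[order(result$group, result$subset), ]
result
```

Because the labels live in one vector, adding a third subset only means extending subset_of; no separate dataset per subset is needed.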

Answer 1: (score: 2)

How about this? We create a new variable, group by the two variables of interest, and then summarise.

library(dplyr)

x1<-matrix(c("A","A","A","A","A","B","B","B","B","B",
"x1","x2","x3","x4","x5","x1","x2","x3","x4","x5",
1,2,3,4,5,6,7,8,9,10),nrow = 10, ncol = 3)

x1 %>% 
  as.data.frame() %>%
  mutate(sub_group = case_when(grepl("[1-3]", V2) ~ 1, TRUE ~ 2),
         V3 = as.numeric(as.character(V3))) %>%
  group_by(V1, sub_group) %>%
  summarise(total = sum(V3)) %>%
  select(-sub_group) 
#> # A tibble: 4 x 2
#> # Groups:   V1 [2]
#>   V1    total
#>   <fct> <dbl>
#> 1 A         6
#> 2 A         9
#> 3 B        21
#> 4 B        19

Created on 2018-09-19 by the reprex package (v0.2.0).