My df is structured as follows:
Ateco. Numb. Reg
10 223 A
11 332 A
12 343 A
10 223 B
11 332 B
12 343 B
29 414 B
30 434 B
31 444 B
32 464 B
I would like to get another df in which Numb. is the sum over the Ateco. values I choose:
Ateco. Numb. Reg
10_11_12 898 A
10_11_12 898 B
29 414 B
30 434 B
31 444 B
32 464 B
How can I do this?
Answer 0 (score: 2)
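Based on the updated input example, group by 'Reg' and by whether 'Ateco.' is one of the values 10 to 12, get the sum of 'Numb.', and paste the 'Ateco.' elements together. Then ungroup and remove 'grp' if needed.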
library(tidyverse)
df %>%
  # grp is TRUE for the Ateco. values to be merged (10 to 12)
  group_by(Reg, grp = Ateco. %in% 10:12) %>%
  summarise(Numb. = sum(Numb.),
            Ateco. = paste(Ateco., collapse = "_")) %>%
  ungroup() %>%
  # drop the helper column
  select(-grp)
# A tibble: 3 x 3
# Reg Numb. Ateco.
# <chr> <int> <chr>
#1 A 898 10_11_12
#2 B 414 29
#3 B 898 10_11_12
If we instead assume that 'grp' is created based on whether an 'Ateco.' value occurs in both 'Reg' elements, then
df %>%
  group_by(Ateco.) %>%
  # per Ateco. value: does it occur in more than one Reg?
  mutate(grp = n_distinct(Reg) > 1) %>%
  group_by(grp, Reg) %>%
  summarise(Numb. = sum(Numb.),
            Ateco. = paste(Ateco., collapse = "_")) %>%
  ungroup() %>%
  select(-grp)
Based on the new dataset
df2 %>%
  # relabel the selected Ateco. values with one shared key, keep the rest as-is
  group_by(Ateco. = case_when(Ateco. %in% 10:12 ~ '10_11_12',
                              TRUE ~ as.character(Ateco.)), Reg) %>%
  summarise(Numb. = sum(Numb.))
# A tibble: 6 x 3
# Groups: Ateco. [?]
# Ateco. Reg Numb.
# <chr> <chr> <int>
#1 10_11_12 A 898
#2 10_11_12 B 898
#3 29 B 414
#4 30 B 434
#5 31 B 444
#6 32 B 464
Data used above:
df <- structure(list(Ateco. = c(10L, 11L, 12L, 10L, 11L, 12L, 29L),
Numb. = c(223L, 332L, 343L, 223L, 332L, 343L, 414L), Reg = c("A",
"A", "A", "B", "B", "B", "B")), class = "data.frame", row.names = c(NA,
-7L))
df2 <- structure(list(Ateco. = c(10L, 11L, 12L, 10L, 11L, 12L, 29L,
30L, 31L, 32L), Numb. = c(223L, 332L, 343L, 223L, 332L, 343L,
414L, 434L, 444L, 464L), Reg = c("A", "A", "A", "B", "B", "B",
"B", "B", "B", "B")), class = "data.frame", row.names = c(NA,
-10L))
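For readers who would rather not load the tidyverse, the relabel-then-sum idea from the last snippet can be sketched in base R. This is only an illustration, not part of the original answer: the names `sel`, `relabelled` and `res` are made up here, and the data is re-created with `data.frame()` so the block runs on its own.

```r
# Same values as df2 in the answer, rebuilt so the example is self-contained
df2 <- data.frame(
  Ateco. = c(10L, 11L, 12L, 10L, 11L, 12L, 29L, 30L, 31L, 32L),
  Numb.  = c(223L, 332L, 343L, 223L, 332L, 343L, 414L, 434L, 444L, 464L),
  Reg    = c("A", "A", "A", "B", "B", "B", "B", "B", "B", "B")
)

# Hypothetical helper vector: the Ateco. values to merge
sel <- 10:12

# Replace every selected Ateco. value with one shared label ("10_11_12"),
# leaving all other values as their own character label
relabelled <- transform(
  df2,
  Ateco. = ifelse(Ateco. %in% sel,
                  paste(sel, collapse = "_"),
                  as.character(Ateco.))
)

# Sum Numb. within each (Ateco., Reg) combination
res <- aggregate(Numb. ~ Ateco. + Reg, data = relabelled, FUN = sum)
print(res)
```

Changing `sel` is enough to merge a different set of Ateco. values; the label is derived from it, so nothing else needs to be edited.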