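I need to aggregate rows that contain the same pair of items across two columns, regardless of which column each item appears in. Is there an "or" function that can do this? I've posted a sample dataset below: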
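How can I aggregate so that I get a single value combining the frequencies of rows 3 and 5? Row 3 has Animal1 = B and Animal2 = D, and row 5 has Animal1 = D and Animal2 = B, so the combined frequency should be 6.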
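Answer 0: (score: 2)
Here is one possible solution. I may have overcomplicated it, but it should give you the result you want. The first thing I did was make sure the strings in the data frame are kept as character vectors rather than converted to factors (stringsAsFactors = FALSE).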
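library(dplyr)
library(purrr)  # provides map2_chr()

A1 <- data.frame(Animal1 = c("A", "A", "B", "B", "D"), Animal2 = c("B", "D", "D", "A", "B"),
                 Frequency = c(2, 3, 1, 4, 5), stringsAsFactors = FALSE)

# Sort each pair so that (B, D) and (D, B) map to the same key, then sum by key
A1 %>%
  mutate(combined = map2_chr(Animal1, Animal2, ~ paste0(sort(c(.x, .y)), collapse = ""))) %>%
  group_by(combined) %>%
  summarise(total = sum(Frequency))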
Output
# A tibble: 3 x 2
  combined total
  <chr>    <dbl>
1 AB           6
2 AD           3
3 BD           6
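For comparison, the same order-independent key can be built without purrr. The following is only a minimal base R sketch, assuming the animal names are plain character strings as in the sample data. It uses pmin() and pmax(), which compare the two columns element-wise, in place of the row-by-row sort(). The names totals and combined are just illustrative.

```r
# Sample data copied from the answer above.
A1 <- data.frame(Animal1 = c("A", "A", "B", "B", "D"),
                 Animal2 = c("B", "D", "D", "A", "B"),
                 Frequency = c(2, 3, 1, 4, 5),
                 stringsAsFactors = FALSE)

# pmin()/pmax() pick the alphabetically smaller/larger name in each row,
# so (B, D) and (D, B) both become the key "BD".
A1$combined <- paste0(pmin(A1$Animal1, A1$Animal2),
                      pmax(A1$Animal1, A1$Animal2))

# Sum Frequency within each key.
totals <- aggregate(Frequency ~ combined, data = A1, FUN = sum)
totals
```

On the sample data this should give the same three groups as the tibble above (AB, AD, BD), with the summed column named Frequency instead of total.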
Answer 1: (score: 1)
I'm not sure I understand your question, but is this what you're looking for?
library(dplyr)

df %>%
  as_tibble() %>%
  filter((Animal1 == "B" & Animal2 == "D") | (Animal1 == "D" & Animal2 == "B")) %>%
  summarise(sum_freq = sum(Frequency))
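Since df is not defined in this answer, here is a small base R sketch of the same "or" condition applied to the sample data from the other answers. It assumes df has the same columns as A1. The names is_bd and bd_total are hypothetical, used only for illustration.

```r
# Sample data, assumed to match the asker's data frame.
A1 <- data.frame(Animal1 = c("A", "A", "B", "B", "D"),
                 Animal2 = c("B", "D", "D", "A", "B"),
                 Frequency = c(2, 3, 1, 4, 5),
                 stringsAsFactors = FALSE)

# `|` is R's element-wise "or" and `&` is element-wise "and".
# A row matches if it is (B, D) or (D, B).
is_bd <- (A1$Animal1 == "B" & A1$Animal2 == "D") |
         (A1$Animal1 == "D" & A1$Animal2 == "B")

# Rows 3 and 5 match, so the sum is 1 + 5 = 6.
bd_total <- sum(A1$Frequency[is_bd])
bd_total
```

This handles one pair at a time. To total every pair at once, a grouping approach like the one in the first answer is probably more convenient.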
Answer 2: (score: 0)
Thanks, everyone. Building on @StephenK's answer, I added one more step to split the new "combined" column back into two columns.
library(dplyr)
library(purrr)

A1 <- data.frame(Animal1 = c("A", "A", "B", "B", "D"), Animal2 = c("B", "D", "D", "A", "B"),
                 Frequency = c(2, 3, 1, 4, 5), stringsAsFactors = FALSE)

A2 <- A1 %>%
  mutate(combined = map2_chr(Animal1, Animal2, ~ paste0(sort(c(.x, .y)), collapse = ""))) %>%
  group_by(combined) %>%
  summarise(total = sum(Frequency)) %>%
  as.data.frame()

# Create new columns for each letter
A2$Animal1 <- substr(A2$combined, start = 1, stop = 1)
A2$Animal2 <- substr(A2$combined, start = 2, stop = 2)
A2
  combined total Animal1 Animal2
1       AB     6       A       B
2       AD     3       A       D
3       BD     6       B       D
# Select only the columns needed and reorder them
A3 <- A2[, c("Animal1", "Animal2", "total")]
A3
  Animal1 Animal2 total
1       A       B     6
2       A       D     3
3       B       D     6
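One caveat with the substr() split is that it relies on every animal name being exactly one character long. A possible alternative, sketched here under the assumption that dplyr 1.0 or later is installed, is to group on the sorted pair as two separate columns so nothing has to be pasted together and split again. The column names low and high are hypothetical.

```r
library(dplyr)

A1 <- data.frame(Animal1 = c("A", "A", "B", "B", "D"),
                 Animal2 = c("B", "D", "D", "A", "B"),
                 Frequency = c(2, 3, 1, 4, 5),
                 stringsAsFactors = FALSE)

# Group directly on the alphabetically smaller and larger name of each row.
# New column names are used on purpose: reusing Animal1/Animal2 here would
# overwrite Animal1 before pmax() is evaluated.
A3 <- A1 %>%
  group_by(low = pmin(Animal1, Animal2),
           high = pmax(Animal1, Animal2)) %>%
  summarise(total = sum(Frequency), .groups = "drop") %>%
  as.data.frame()
A3
```

On the sample data this should produce the same three rows as A3 above, and it should keep working if the names are longer than one letter.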