I have a data frame like this:
data = read.table(text = "region plot species
1 1A A_B
1 1A A_B
1 1B B_C
1 1C A_B
1 1D C_D
2 2A B_C
2 2A B_C
2 2A E_F
2 2B B_C
2 2B E_F
2 2C E_F
2 2D B_C
3 3A A_B
3 3B A_B", stringsAsFactors = FALSE, header = TRUE)
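I want to compare every level of plot within each region and, for each pair of plots, count the unique species they have in common. However, I don't want a plot compared with itself (i.e. remove/exclude 1A_1A, 1B_1B, 2C_2C, etc.). The output for this example should look like this: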
output<-
region plot freq
1 1A_1B 0
1 1A_1C 1
1 1A_1D 0
1 1B_1C 0
1 1B_1D 0
1 1C_1D 0
2 2A_2B 2
2 2A_2C 1
2 2A_2D 1
2 2B_2C 1
2 2B_2D 1
2 2C_2D 0
3 3A_3B 1
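I have adapted the following code from @HubertL's answer to "Convert list of matrices to a single data frame", but I am struggling to work in an appropriate if/else statement to meet this condition: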
library(tidyverse)
data %>% group_by(region, species) %>%
filter(n() > 1) %>%
summarize(y = list(combn(plot, 2, paste, collapse="_"))) %>%
unnest %>%
group_by(region, y) %>%
summarize(ifelse(plot[i] = plot[i], freq =
length(unique((species),)
Answer 0 (score: 0)
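You can do this by adding filter(!duplicated(plot)), which keeps each plot only once within a region/species group, so combn() never pairs a plot with itself: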
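data %>% group_by(region, species) %>%
  filter(!duplicated(plot)) %>%   # keep each plot once per region/species
  filter(n() > 1) %>%             # combn() needs at least two plots
  summarize(y = list(combn(plot, 2, paste, collapse="_"))) %>%
  unnest(y) %>%                   # one row per plot pair sharing the species
  group_by(region, y) %>%
  summarize(freq=n())             # number of species shared by each pair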
region y freq
<int> <chr> <int>
1 1 1A_1C 1
2 2 2A_2B 2
3 2 2A_2C 1
4 2 2A_2D 1
5 2 2B_2C 1
6 2 2B_2D 1
7 3 3A_3B 1
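Note that the result above only lists pairs that share at least one species, whereas the desired output in the question also includes pairs with a frequency of 0. One possible way to get those rows, sketched here under the assumption that dplyr (1.0 or later) and tidyr are available, is to build every plot pair per region first and then join the counts onto it. The names d, all_pairs, shared and result are illustrative and do not come from the original answer.

```r
library(dplyr)
library(tidyr)

data <- read.table(text = "region plot species
1 1A A_B
1 1A A_B
1 1B B_C
1 1C A_B
1 1D C_D
2 2A B_C
2 2A B_C
2 2A E_F
2 2B B_C
2 2B E_F
2 2C E_F
2 2D B_C
3 3A A_B
3 3B A_B", stringsAsFactors = FALSE, header = TRUE)

# One row per region/plot/species, sorted so pair labels are built
# in a consistent order (e.g. always "2A_2B", never "2B_2A").
d <- data %>%
  distinct(region, plot, species) %>%
  arrange(region, plot)

# Every pair of distinct plots within a region, shared species or not.
all_pairs <- d %>%
  distinct(region, plot) %>%
  group_by(region) %>%
  filter(n() > 1) %>%   # combn() needs at least two plots
  summarize(y = list(combn(plot, 2, paste, collapse = "_")), .groups = "drop") %>%
  unnest(y)

# Number of shared species for the pairs that share at least one.
shared <- d %>%
  group_by(region, species) %>%
  filter(n() > 1) %>%
  summarize(y = list(combn(plot, 2, paste, collapse = "_")), .groups = "drop") %>%
  unnest(y) %>%
  count(region, y, name = "freq")

# Pairs without a match get NA from the join; turn those into 0.
result <- all_pairs %>%
  left_join(shared, by = c("region", "y")) %>%
  mutate(freq = replace_na(freq, 0L))

print(result)
```

On the example data this should give 13 rows, one per plot pair within each region, with the pairs that share no species kept at a freq of 0. Sorting by plot before calling combn() matters if the input rows are not already ordered, because otherwise the same pair could be labelled in two different ways.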