Question

I would like to obtain all unique pairwise combinations of a single string column of a data frame, ideally using the tidyverse.

Here is a dummy example:

library(tidyverse)

a <- letters[1:3] %>% 
        tibble::as_tibble()
a
#> # A tibble: 3 x 1
#>   value
#>   <chr>
#> 1     a
#> 2     b
#> 3     c

tidyr::crossing(a, a) %>% 
    magrittr::set_colnames(c("words1", "words2"))
#> # A tibble: 9 x 2
#>   words1 words2
#>    <chr>  <chr>
#> 1      a      a
#> 2      a      b
#> 3      a      c
#> 4      b      a
#> 5      b      b
#> 6      b      c
#> 7      c      a
#> 8      c      b
#> 9      c      c

Is there a way to remove the 'duplicate' combinations here, so that the output for this example is as follows:

#> # A tibble: 3 x 2
#>   words1 words2
#>    <chr>  <chr>
#> 1      a      b
#> 2      a      c
#> 3      b      c

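I was hoping there would be a nice purrr::map or filter approach I could pipe into to accomplish the above.

Edit: There are similar questions to this one, e.g. here, flagged by @Sotos. Here, I am specifically looking for tidyverse (purrr, dplyr) solutions that fit the pipeline I have set up. The other answers use various other packages that I do not want to include as dependencies.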

Answer 1

Hopefully there is a better way, but I usually use this...

library(tidyverse)

df <- tibble(value = letters[1:3])

df %>% 
  expand(value, value1 = value) %>%   # name the second copy explicitly so the value1 column always exists
  filter(value < value1)

# # A tibble: 3 x 2
#   value value1
#   <chr> <chr> 
# 1 a     b     
# 2 a     c     
# 3 b     c
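
The filter works because, for any two distinct strings, exactly one ordering satisfies `<`, so each unordered pair survives once and self-pairs are dropped. Below is a minimal sketch that checks this on a slightly larger input. It assumes the values in the column are unique, and it names both expanded columns explicitly (`words1` and `words2`, borrowed from the question) rather than relying on automatic renaming.

```r
library(dplyr)
library(tidyr)

df <- tibble(value = letters[1:5])

pairs <- df %>%
  # build every ordered pair, giving both columns explicit names
  expand(words1 = value, words2 = value) %>%
  # keep only one ordering of each pair; this also removes self-pairs
  filter(words1 < words2)

# n unique values should yield choose(n, 2) unordered pairs
nrow(pairs) == choose(nrow(df), 2)
```

Note that `<` on strings follows the current locale's collation order, which is harmless here because any consistent ordering keeps exactly one row per pair.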

Answer 2

Something like this?

tidyr::crossing(a, a) %>% 
  magrittr::set_colnames(c("words1", "words2")) %>%
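  rowwise() %>%
  mutate(lo = sort(c(words1, words2))[1],           # alphabetically smaller word of each row
         hi = sort(c(words1, words2))[2]) %>%       # alphabetically larger word of each row
  ungroup() %>%
  transmute(words1 = lo, words2 = hi) %>%           # temporary names avoid reading an already overwritten words1
  filter(words1 != words2) %>%                      # remove word combinations with itself
  unique()                                          # remove duplicates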

#> # A tibble: 3 x 2
#>   words1 words2
#>    <chr>  <chr>
#> 1      a      b
#> 2      a      c
#> 3      b      c
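
The per-row sort can also be expressed column-wise, which avoids `rowwise()`. The sketch below is a variation on the idea rather than the answerer's code: it assumes base R's `pmin()` and `pmax()`, which compare character vectors element-wise, and uses `distinct()` in place of `unique()`. The temporary column names `lo` and `hi` are illustrative only; they are needed because `mutate()`/`transmute()` evaluate their arguments in sequence, so overwriting `words1` first would affect the computation of `words2`.

```r
library(dplyr)
library(tidyr)

a <- tibble(value = letters[1:3])

result <- crossing(words1 = a$value, words2 = a$value) %>%
  # put the alphabetically smaller word first in every row
  transmute(lo = pmin(words1, words2),
            hi = pmax(words1, words2)) %>%
  filter(lo != hi) %>%                  # drop a word paired with itself
  distinct() %>%                        # drop repeated pairs
  rename(words1 = lo, words2 = hi)      # restore the column names from the question

result
```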

tidyr - unique way to get combinations (using only tidyverse)
