Question

Edit
My question was quite unclear, so I have re-edited it to make it more useful to others. It already has an answer.

Example data.frame:

set.seed(10) 
df <- data.frame(a = sample(1:3, 30, replace = TRUE), b = sample(1:3, 30, replace = TRUE), c = sample(1:3, 30, replace = TRUE))

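My question:

I have several columns (a, b, c in my example). Somewhat similar to, but different from, this question asked by R-user, I would like to count the possible 'sets of values' across, in this case, three columns (but in general: n columns), regardless of their order.

count(df, a, b, c) from dplyr does not do this: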

require (dplyr)
count(df,a,b,c)
    # A tibble: 17 x 4
           a     b     c     n
       <int> <int> <int> <int>
     1     1     1     1     1
     2     1     1     2     2
     ...
     7     2     1     1     4
     ...

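In this example, rows 2 and 7 contain the same set of values (1, 1, 2). That is not what I want: I do not care about the order of the values within a set, so '1,1,2' and '2,1,1' should be treated as the same. How can I count these sets of values?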

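Edit 2: The trick in @Mouad_S's answer is that you first sort the values within each row using apply(), then transpose the result with t(), and then you can use count() on the columns.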
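
The reason for the transpose can be seen on a small hand-made data frame. This is only an illustrative sketch: `toy` and its values are made up for the example and are not part of the original data, and base R's unique() stands in for dplyr here so that the snippet has no dependencies.

```r
# Hypothetical toy data, chosen so the result does not depend on the RNG
toy <- data.frame(a = c(1, 2, 1, 3),
                  b = c(1, 1, 2, 3),
                  c = c(2, 1, 1, 1))

# apply() over rows (MARGIN = 1) returns each sorted row as a *column*,
# so the result has 3 rows and 4 columns
sorted_cols <- apply(toy, 1, function(x) sort(unname(x)))
dim(sorted_cols)

# t() turns it back into one sorted row per original row
sorted_rows <- t(sorted_cols)
dim(sorted_rows)

# Rows 1 to 3 all become (1, 1, 2), so only two distinct value sets remain
n_sets <- nrow(unique(sorted_rows))
n_sets
```

Without the t() step, any later counting would run over the 4 columns of `sorted_cols` (one per original row) rather than over the 3 sorted value positions.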

Answer 1

library(dplyr)

set.seed(10)
df <- data.frame(a = sample(1:3, 30, replace = TRUE),
                 b = sample(1:3, 30, replace = TRUE),
                 c = sample(1:3, 30, replace = TRUE))

## The old answer: count how often each value set occurs.
## Sorting within each row makes (1,1,2) and (2,1,1) identical;
## the columns of the resulting data frame are named X1, X2, X3.
count(data.frame(t(apply(df, 1, sort))), X1, X2, X3)

## The new answer: count how many distinct value sets there are.
t(apply(df, 1, sort)) %>%   # sort the values within each row
  as.data.frame() %>%       # turn the resulting matrix into a data frame
  distinct() %>%            # keep only the unique rows
  nrow()                    # count them

[1] 9
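
For the general case of n columns, typing out X1, X2, ... can be avoided by collapsing each sorted row into a single key. The following is a minimal base R sketch, not part of the original answer, and it assumes all columns are numeric. `count_value_sets` is a hypothetical helper name, not a dplyr function, and `example_df` is made-up data.

```r
# Hypothetical helper: counts rows per value set, ignoring column order.
# Assumes every column of `data` is numeric.
count_value_sets <- function(data) {
  # Sort each row and collapse it into a key such as "1,1,2";
  # the comma separator keeps "1,12" and "11,2" distinct
  key <- apply(as.matrix(data), 1, function(x) paste(sort(x), collapse = ","))
  # Tabulate the keys and return them as a data frame with a count column n
  as.data.frame(table(value_set = key),
                responseName = "n",
                stringsAsFactors = FALSE)
}

# Made-up example: rows 1 and 2 hold the same values in a different order
example_df <- data.frame(a = c(1, 2, 3),
                         b = c(2, 1, 3),
                         c = c(3, 3, 3))

res <- count_value_sets(example_df)
res        # one row per value set, with its frequency in column n
nrow(res)  # number of distinct value sets
```

The per-set frequencies correspond to the old answer and nrow() of the result corresponds to the new answer, so one call covers both readings of the question.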
