R: unique combinations (avoiding a-b and b-a duplicates, and identical pairs a-a, b-b)

Asked: 2011-11-06 23:56:17

Tags: r variables conditional unique

I have the following two variable columns:

var1 <- c("a", "b", "a", "a", "c", "a", "b", "b", "c", "b", "c", "c", "d")
var2 <- c("a", "a", "b", "c", "a", "d", "b", "c", "b", "d", "c", "d", "d")
mydf <- data.frame(var1, var2)

I want to find the unique combinations of the two variables, such that:

(a) var1 a - var2 b and var1 b - var2 a are not counted as distinct.
(b) no identical pairs are present,
      for example var1 a and var2 a, or var1 b and var2 b.

I tried the following code, which does not give what I expect:

unique(mydf)
   var1 var2
1     a    a
2     b    a
3     a    b
4     a    c
5     c    a
6     a    d
7     b    b
8     b    c
9     c    b
10    b    d
11    c    c
12    c    d
13    d    d

My expected output is:

  var1 var2
1     a    b
2     a    c
3     a    d
4     b    c
5     b    d
6     c    d

Thanks.

2 answers:

Answer 0 (score: 3)

This should do it:

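# drop rows where both columns hold the same value (a-a, b-b, ...)
mydf <- mydf[mydf[, 1] != mydf[, 2], ]
# sort each row so a-b and b-a look alike, then keep the first row of each pair
mydf <- mydf[!duplicated(data.frame(t(apply(mydf, 1, sort)))), ]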
> mydf
   var1 var2
2     b    a
4     a    c
6     a    d
8     b    c
10    b    d
12    c    d
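
Note that the result above keeps each pair in the orientation it first appears in, which is why row 2 shows b-a rather than the a-b of the expected output. If the ordering within a pair matters, one possible sketch (assuming the columns are character vectors, hence `stringsAsFactors = FALSE`; the names `lo`, `hi` and `res` are only illustrative) is to order each pair with `pmin`/`pmax` before removing duplicates:

```r
var1 <- c("a", "b", "a", "a", "c", "a", "b", "b", "c", "b", "c", "c", "d")
var2 <- c("a", "a", "b", "c", "a", "d", "b", "c", "b", "d", "c", "d", "d")
mydf <- data.frame(var1, var2, stringsAsFactors = FALSE)

# element-wise smaller / larger value of each row, so b-a becomes a-b
lo <- pmin(mydf$var1, mydf$var2)
hi <- pmax(mydf$var1, mydf$var2)

# drop repeated pairs, then drop identical pairs such as a-a
res <- unique(data.frame(var1 = lo, var2 = hi, stringsAsFactors = FALSE))
res <- res[res$var1 != res$var2, ]

# tidy ordering and row names
res <- res[order(res$var1, res$var2), ]
rownames(res) <- NULL
res
```

This avoids the row-wise `apply`, at the cost of discarding the original row numbers.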

Answer 1 (score: 0)

This is more of an exercise to teach myself some of the behavior of the sets package:

require(sets)
mydf <- data.frame(var1, var2, stringsAsFactors=FALSE)  # unneeded factors are a plague on R/S
dlis <- list()
for (i in seq_len(nrow(mydf))) {
  s <- set(mydf[i, 1], mydf[i, 2])
  # a set collapses equal elements, so length 2 means var1 != var2
  if (length(s) == 2) {
    dlis <- c(dlis, list(s))
  }
}
unique(dlis)
[[1]]
{"a", "b"}

[[2]]
{"a", "c"}

[[3]]
{"a", "d"}

[[4]]
{"b", "c"}

[[5]]
{"b", "d"}

[[6]]
{"c", "d"}
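
As a sanity check on the six pairs found, here is a rough base-R sketch (not part of either answer; the names `vals`, `all_pairs` and `seen` are illustrative) that enumerates every unordered pair of the distinct values with `combn` and compares it against the pairs that actually occur in the data. For this particular data set the two agree, but that need not hold in general, since `combn` lists every possible pair whether or not it is observed:

```r
var1 <- c("a", "b", "a", "a", "c", "a", "b", "b", "c", "b", "c", "c", "d")
var2 <- c("a", "a", "b", "c", "a", "d", "b", "c", "b", "d", "c", "d", "d")

# every possible unordered pair of distinct values, as "x-y" keys
vals <- sort(unique(c(var1, var2)))
all_pairs <- apply(combn(vals, 2), 2, paste, collapse = "-")

# pairs actually observed, ordered within each pair, minus identical pairs
seen <- unique(paste(pmin(var1, var2), pmax(var1, var2), sep = "-"))
seen <- setdiff(seen, paste(vals, vals, sep = "-"))

# possible pairs that never occur in the data (empty for this data set)
setdiff(all_pairs, seen)
```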