I have the following columns of variables:
var1 <- c("a", "b", "a", "a", "c", "a", "b", "b", "c", "b", "c", "c", "d")
var2 <- c("a", "a", "b", "c", "a", "d", "b", "c", "b", "d", "c", "d", "d")
mydf <- data.frame(var1, var2)
I want to find the unique combinations of the variables, such that:
(a) order does not matter: var1 = a, var2 = b and var1 = b, var2 = a count as the same combination;
(b) no identical pairs are present, for example var1 = a and var2 = a, or var1 = b and var2 = b.
I tried the following code, which does not give what I expect:
unique(mydf)
   var1 var2
1     a    a
2     b    a
3     a    b
4     a    c
5     c    a
6     a    d
7     b    b
8     b    c
9     c    b
10    b    d
11    c    c
12    c    d
13    d    d
My expected output is:
  var1 var2
1    a    b
2    a    c
3    a    d
4    b    c
5    b    d
6    c    d
Thanks.
Answer 0 (score: 3)
This should do it:
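# drop rows where both columns hold the same value
mydf = mydf[mydf[,1] != mydf[,2], ]
# sort each row so (b, a) and (a, b) look alike, then keep the first occurrence of each pair
mydf = mydf[!duplicated(data.frame(t(apply(mydf, 1, sort)))), ]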
> mydf
   var1 var2
2     b    a
4     a    c
6     a    d
8     b    c
10    b    d
12    c    d
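Note that the result above keeps each pair in the orientation in which it first appears (row 2 is b a), whereas the expected output in the question lists the alphabetically smaller value first. A minimal sketch of one way to get that form, assuming the columns are character vectors (hence `stringsAsFactors = FALSE`) and using `pairs` as an illustrative name that is not part of the original answer:

```r
var1 <- c("a", "b", "a", "a", "c", "a", "b", "b", "c", "b", "c", "c", "d")
var2 <- c("a", "a", "b", "c", "a", "d", "b", "c", "b", "d", "c", "d", "d")
mydf <- data.frame(var1, var2, stringsAsFactors = FALSE)

# Put the alphabetically smaller value in var1 and the larger in var2,
# so that (b, a) and (a, b) become the same row
pairs <- data.frame(var1 = pmin(mydf$var1, mydf$var2),
                    var2 = pmax(mydf$var1, mydf$var2),
                    stringsAsFactors = FALSE)

# Drop identical pairs such as (a, a), then remove duplicate rows
pairs <- pairs[pairs$var1 != pairs$var2, ]
pairs <- unique(pairs)

# Sort and renumber the rows for display
pairs <- pairs[order(pairs$var1, pairs$var2), ]
rownames(pairs) <- NULL
pairs
```

Because `pmin` and `pmax` are vectorized, this avoids the row-wise `apply`, which may matter on larger data frames.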
Answer 1 (score: 0)
More of an exercise in teaching myself some of the behavior of the sets package:
require(sets)
mydf <- data.frame(var1, var2, stringsAsFactors = FALSE)  # unneeded factors are a plague on R/S
dlis <- list()
for (i in seq_len(nrow(mydf))) {
  # a set built from two equal values has length 1, so identical pairs are skipped
  if (length(set(mydf[i, 1], mydf[i, 2])) == 2) {
    dlis <- c(dlis, list(set(mydf[i, 1], mydf[i, 2])))
  }
}
unique(dlis)
[[1]]
{"a", "b"}
[[2]]
{"a", "c"}
[[3]]
{"a", "d"}
[[4]]
{"b", "c"}
[[5]]
{"b", "d"}
[[6]]
{"c", "d"}