In a data frame, I have a column that contains strings. Let's say it looks like this:
x <- unique(df[,1])
x
"A" "A" "B" "B" "B" "C"
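I want to get all possible combinations of the unique strings, taken two at a time, without regard to order, so A, B is the same as B, A. I also don't want pairs of identical values such as A, A. This is what I have so far: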
comb <- expand.grid(x, x)
comb <- comb[which(comb[,1] != comb[,2]),]
But this still leaves rows that contain the same pair of strings in a different order. How do I get rid of those?
Answer 0 (score: 18)
There is the combn function in the utils package:
t(combn(LETTERS[1:3],2))
#      [,1] [,2]
# [1,] "A" "B"
# [2,] "A" "C"
# [3,] "B" "C"
I'm a bit confused as to why your x has duplicated values.
Answer 1 (score: 12)
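If you need each pair as a single value rather than as a matrix row, combn also accepts a FUN argument that is applied to every combination. The following is only a sketch: it assumes x is the vector shown in the question, and the names pairs and labels are made up for illustration.

```r
# Assumed input: the vector from the question, duplicates included.
x <- c("A", "A", "B", "B", "B", "C")

# Drop duplicates first, then build every unordered pair (one pair per row).
pairs <- t(combn(unique(x), 2))

# Alternatively, collapse each pair into a single label via FUN;
# collapse = "-" is passed through to paste().
labels <- combn(unique(x), 2, FUN = paste, collapse = "-")
```

Here pairs has one row per pair and labels is a character vector with one element per pair.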
I think you are looking for combn:
x <- c("A", "A", "B", "B", "B", "C")
combn(x,2)
which gives:
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15]
[1,] "A"  "A"  "A"  "A"  "A"  "A"  "A"  "A"  "A"  "B"   "B"   "B"   "B"   "B"   "B"
[2,] "A"  "B"  "B"  "B"  "C"  "B"  "B"  "B"  "C"  "B"   "B"   "C"   "B"   "C"   "C"
If you only want to use the unique values in x (I don't know why x has duplicate values in the first place, if it is the result of a unique() call):
> combn(unique(x),2)
     [,1] [,2] [,3]
[1,] "A"  "A"  "B"
[2,] "B"  "C"  "C"
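If you would rather stay with the expand.grid approach from the question, one possible fix is to keep only the rows where the first value sorts strictly before the second. That removes both the A, A style rows and the mirrored duplicates in one step. This is a rough sketch, not something from the question itself: it assumes the strings can be compared with <, and the column names first and second are made up for illustration.

```r
# Assumed input: the vector from the question, duplicates included.
x <- c("A", "A", "B", "B", "B", "C")
u <- unique(x)

# stringsAsFactors = FALSE keeps the columns as character vectors,
# so they can be compared with < (factors would not compare this way).
comb <- expand.grid(first = u, second = u, stringsAsFactors = FALSE)

# Strict < drops pairs of identical values and keeps only one ordering
# of each remaining pair.
comb <- comb[comb$first < comb$second, ]
rownames(comb) <- NULL
```

For the example data this leaves three rows, the same pairs that combn(unique(x), 2) returns as columns. Note that string comparison in R depends on the locale, so for mixed-case or non-ASCII strings combn is probably the safer choice.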