How do I get unique pairs from a data frame in R?

Time: 2016-12-29 15:39:15

Tags: r

I have this data frame:

      [,1]            [,2]           
 [1,] "CHC.AU.Equity" "SGP.AU.Equity"
 [2,] "CMA.AU.Equity" "SGP.AU.Equity"
 [3,] "AJA.AU.Equity" "AOG.AU.Equity"
 [4,] "AJA.AU.Equity" "GOZ.AU.Equity"
 [5,] "AJA.AU.Equity" "SCG.AU.Equity"
 [6,] "ABP.AU.Equity" "AOG.AU.Equity"
 [7,] "AOG.AU.Equity" "FET.AU.Equity"
 [8,] "SGP.AU.Equity" "CHC.AU.Equity"

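How can I keep only the unique pairs, regardless of order? For example, in the df above, row 8 would "match" row 1 and be excluded. I tried using setequal(), but I can't seem to get it to work. Is there a 'setunique'-type function?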

1 answer:

Answer 0: (score: 1)

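We can loop over the rows with apply and sort the elements of each row, so that a pair and its reverse become identical. Because apply returns one column per input row, we transpose the output to get one row per pair again. We then apply duplicated and negate it, which gives a logical index (TRUE for the first occurrence of a pair, FALSE for a repeat), and use that index to subset the rows.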

m1[!duplicated(t(apply(m1, 1, sort))),]
#         [,1]            [,2]           
#[1,] "CHC.AU.Equity" "SGP.AU.Equity"
#[2,] "CMA.AU.Equity" "SGP.AU.Equity"
#[3,] "AJA.AU.Equity" "AOG.AU.Equity"
#[4,] "AJA.AU.Equity" "GOZ.AU.Equity"
#[5,] "AJA.AU.Equity" "SCG.AU.Equity"
#[6,] "ABP.AU.Equity" "AOG.AU.Equity"
#[7,] "AOG.AU.Equity" "FET.AU.Equity"
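
To try the one-liner above end to end, here is a minimal self-contained sketch. The construction of m1 is an assumption: the question only shows the printed object, which looks like a two-column character matrix rather than a data frame. The intermediate names sorted_pairs, keep, unique_pairs, lo, hi and keep_alt are hypothetical and exist only to show each step. The pmin/pmax variant at the end is an alternative that is not part of the answer; it should give the same index for two columns without looping over rows.

```r
# Assumed reconstruction of the object printed in the question:
# a two-column character matrix with one pair per row.
m1 <- matrix(c(
  "CHC.AU.Equity", "SGP.AU.Equity",
  "CMA.AU.Equity", "SGP.AU.Equity",
  "AJA.AU.Equity", "AOG.AU.Equity",
  "AJA.AU.Equity", "GOZ.AU.Equity",
  "AJA.AU.Equity", "SCG.AU.Equity",
  "ABP.AU.Equity", "AOG.AU.Equity",
  "AOG.AU.Equity", "FET.AU.Equity",
  "SGP.AU.Equity", "CHC.AU.Equity"
), ncol = 2, byrow = TRUE)

# Sort within each row so that (A, B) and (B, A) become identical.
# apply() returns one column per input row, hence the t().
sorted_pairs <- t(apply(m1, 1, sort))

# duplicated() on a matrix compares whole rows;
# negating it keeps the first occurrence of each pair.
keep <- !duplicated(sorted_pairs)

# drop = FALSE keeps the result a matrix even if only one row survives.
unique_pairs <- m1[keep, , drop = FALSE]
nrow(unique_pairs)  # 7: row 8 is dropped as the reverse of row 1

# Alternative (not from the answer): element-wise min/max of the two columns.
lo <- pmin(m1[, 1], m1[, 2])
hi <- pmax(m1[, 1], m1[, 2])
keep_alt <- !duplicated(cbind(lo, hi))
```

Both approaches keep whichever ordering of a pair appears first in the input; the sorted copy is only used to build the index, so the surviving rows are returned exactly as they were entered.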