How do I filter a data frame in R by unique values across two columns?

Time: 2016-12-29 16:15:17

Tags: r

I have this data frame:

      [,1]            [,2]           
 [1,] "CHC.AU.Equity" "SGP.AU.Equity"
 [2,] "CMA.AU.Equity" "SGP.AU.Equity"
 [3,] "AJA.AU.Equity" "AOG.AU.Equity"
 [4,] "AJA.AU.Equity" "GOZ.AU.Equity"
 [5,] "AJA.AU.Equity" "SCG.AU.Equity"
 [6,] "ABP.AU.Equity" "AOG.AU.Equity"
 [7,] "AOG.AU.Equity" "FET.AU.Equity"
 [8,] "LIC.AU.Equity" "VRF.AU.Equity"

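How can I filter this so that it contains only the first instance of each string from EITHER column? These are stock trades, and I can only be in a given stock once, not twice.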

To be clearer, what I want is code that does this:

      [,1]            [,2]           
 [1,] "CHC.AU.Equity" "SGP.AU.Equity" <- STAY
 [2,] "CMA.AU.Equity" "SGP.AU.Equity" <- GONE
 [3,] "AJA.AU.Equity" "AOG.AU.Equity" <- STAY
 [4,] "AJA.AU.Equity" "GOZ.AU.Equity" <- GONE
 [5,] "AJA.AU.Equity" "SCG.AU.Equity" <- GONE
 [6,] "ABP.AU.Equity" "AOG.AU.Equity" <- GONE
 [7,] "AOG.AU.Equity" "FET.AU.Equity" <- GONE
 [8,] "LIC.AU.Equity" "VRF.AU.Equity" <- STAY

Which would produce:

      [,1]            [,2]           
 [1,] "CHC.AU.Equity" "SGP.AU.Equity"
 [3,] "AJA.AU.Equity" "AOG.AU.Equity"
 [8,] "LIC.AU.Equity" "VRF.AU.Equity"
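One way to read the rule illustrated above is as a greedy pass: walk the rows in order and keep a row only if neither of its tickers already appears in a row that was kept. The following is a minimal sketch under that assumption, not a definitive answer. The helper name `keep_first_trades` is made up for illustration, and `test` is rebuilt here as a two-column character matrix because that is what the printout looks like.

```r
# Rebuild the example as a two-column character matrix
# (assumed to match the object printed in the question).
test <- matrix(c(
  "CHC.AU.Equity", "SGP.AU.Equity",
  "CMA.AU.Equity", "SGP.AU.Equity",
  "AJA.AU.Equity", "AOG.AU.Equity",
  "AJA.AU.Equity", "GOZ.AU.Equity",
  "AJA.AU.Equity", "SCG.AU.Equity",
  "ABP.AU.Equity", "AOG.AU.Equity",
  "AOG.AU.Equity", "FET.AU.Equity",
  "LIC.AU.Equity", "VRF.AU.Equity"
), ncol = 2, byrow = TRUE)

# Hypothetical helper: keep a row only if neither ticker is already
# part of a previously kept row.
keep_first_trades <- function(m) {
  used <- character(0)        # tickers committed to a kept trade so far
  keep <- logical(nrow(m))    # starts as all FALSE
  for (i in seq_len(nrow(m))) {
    pair <- m[i, ]
    if (!any(pair %in% used)) {
      keep[i] <- TRUE
      used <- c(used, pair)
    }
  }
  m[keep, , drop = FALSE]     # drop = FALSE keeps the result a matrix
}

result <- keep_first_trades(test)
print(result)
```

On the example data this keeps the rows marked STAY. The explicit loop is slower than a vectorised expression on large inputs, but it makes the "only in each stock once" rule easy to audit.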

I just came up with this, which seems to work, but it feels a bit clunky. Let me know if there is a more elegant way to do this, or if this is flawed (the df is named 'test'):

> test[rowSums(t(matrix(duplicated(as.vector(t(test))), nrow = 2))) == 0,]
     [,1]            [,2]           
[1,] "CHC.AU.Equity" "SGP.AU.Equity"
[2,] "AJA.AU.Equity" "AOG.AU.Equity"
[3,] "LIC.AU.Equity" "VRF.AU.Equity"
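For readers who find the one-liner hard to follow, here it is unpacked into named steps, as a sketch that assumes `test` is a two-column character matrix like the printout. The intermediate names (`flat`, `dup`, `dup_m`, `kept`) and the small `edge` matrix are illustrative only. `matrix(dup, ncol = 2, byrow = TRUE)` is used in place of `t(matrix(dup, nrow = 2))`; both produce the same layout.

```r
# Rebuild the example as a two-column character matrix
# (assumed to match the object printed in the question).
test <- matrix(c(
  "CHC.AU.Equity", "SGP.AU.Equity",
  "CMA.AU.Equity", "SGP.AU.Equity",
  "AJA.AU.Equity", "AOG.AU.Equity",
  "AJA.AU.Equity", "GOZ.AU.Equity",
  "AJA.AU.Equity", "SCG.AU.Equity",
  "ABP.AU.Equity", "AOG.AU.Equity",
  "AOG.AU.Equity", "FET.AU.Equity",
  "LIC.AU.Equity", "VRF.AU.Equity"
), ncol = 2, byrow = TRUE)

flat  <- as.vector(t(test))                   # row-major order: r1c1, r1c2, r2c1, ...
dup   <- duplicated(flat)                     # TRUE where a ticker appeared earlier in flat
dup_m <- matrix(dup, ncol = 2, byrow = TRUE)  # back to one row per trade
kept  <- test[rowSums(dup_m) == 0, , drop = FALSE]
print(kept)

# Possible edge case: a ticker counts as "seen" even when the row that
# introduced it was itself dropped.
edge <- matrix(c("A", "B",
                 "A", "C",
                 "C", "D"), ncol = 2, byrow = TRUE)
edge_dup  <- matrix(duplicated(as.vector(t(edge))), ncol = 2, byrow = TRUE)
edge_kept <- edge[rowSums(edge_dup) == 0, , drop = FALSE]
print(edge_kept)   # only the A/B row survives; C/D is dropped because of the dropped A/C row
```

Whether the edge case counts as a flaw depends on the intended rule. If a stock should only be blocked once it is part of a kept trade, the C/D row ought to stay, and the expression above is too strict. Separately, adding `drop = FALSE` avoids the result collapsing to a plain vector when only one row survives.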

0 answers:

No answers