I have this data frame:
     [,1]            [,2]
[1,] "CHC.AU.Equity" "SGP.AU.Equity"
[2,] "CMA.AU.Equity" "SGP.AU.Equity"
[3,] "AJA.AU.Equity" "AOG.AU.Equity"
[4,] "AJA.AU.Equity" "GOZ.AU.Equity"
[5,] "AJA.AU.Equity" "SCG.AU.Equity"
[6,] "ABP.AU.Equity" "AOG.AU.Equity"
[7,] "AOG.AU.Equity" "FET.AU.Equity"
[8,] "LIC.AU.Equity" "VRF.AU.Equity"
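How can I filter this so that it only includes the first instance of each string from EITHER column? These are stock trades, and I can only be in a given stock once, not twice.
To be clearer, what I want is code that does this:
     [,1]            [,2]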
[1,] "CHC.AU.Equity" "SGP.AU.Equity" <- STAY
[2,] "CMA.AU.Equity" "SGP.AU.Equity" <- GONE
[3,] "AJA.AU.Equity" "AOG.AU.Equity" <- STAY
[4,] "AJA.AU.Equity" "GOZ.AU.Equity" <- GONE
[5,] "AJA.AU.Equity" "SCG.AU.Equity" <- GONE
[6,] "ABP.AU.Equity" "AOG.AU.Equity" <- GONE
[7,] "AOG.AU.Equity" "FET.AU.Equity" <- GONE
[8,] "LIC.AU.Equity" "VRF.AU.Equity" <- STAY
Which would produce:
     [,1]            [,2]
[1,] "CHC.AU.Equity" "SGP.AU.Equity"
[3,] "AJA.AU.Equity" "AOG.AU.Equity"
[8,] "LIC.AU.Equity" "VRF.AU.Equity"
I just came up with this, which seems to work, but it feels a bit clunky. Let me know if there is a more elegant way to do this, or if this is flawed (the df is named 'test'):
> test[rowSums(t(matrix(duplicated(as.vector(t(test))), nrow = 2))) == 0,]
     [,1]            [,2]
[1,] "CHC.AU.Equity" "SGP.AU.Equity"
[2,] "AJA.AU.Equity" "AOG.AU.Equity"
[3,] "LIC.AU.Equity" "VRF.AU.Equity"
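For readability, the one-liner above can be unpacked into named steps. This is only a sketch of the same idea, not a different method: it assumes `test` is a two-column character matrix as printed above, and the names `flat`, `is_repeat`, `keep` and `result` are made up for illustration.

```r
# Sample data copied from the question (a 2-column character matrix).
test <- matrix(c(
  "CHC.AU.Equity", "SGP.AU.Equity",
  "CMA.AU.Equity", "SGP.AU.Equity",
  "AJA.AU.Equity", "AOG.AU.Equity",
  "AJA.AU.Equity", "GOZ.AU.Equity",
  "AJA.AU.Equity", "SCG.AU.Equity",
  "ABP.AU.Equity", "AOG.AU.Equity",
  "AOG.AU.Equity", "FET.AU.Equity",
  "LIC.AU.Equity", "VRF.AU.Equity"
), ncol = 2, byrow = TRUE)

# Flatten row by row, so that "first instance" follows the row order.
flat <- as.vector(t(test))

# Flag every value that already appeared earlier in that ordering,
# then reshape the flags back into the shape of `test`.
is_repeat <- matrix(duplicated(flat), ncol = ncol(test), byrow = TRUE)

# Keep only the rows in which neither value is a repeat.
keep <- rowSums(is_repeat) == 0
result <- test[keep, , drop = FALSE]
result
```

On the sample data this keeps the original rows 1, 3 and 8. One thing to be aware of: `duplicated()` looks at every earlier value, so a row is also dropped when its stock first appeared in a row that was itself dropped. That makes no difference for this sample, but it may matter if the intent is to skip only stocks from trades that were actually kept.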