I have a data.frame in R that looks like this:
> inputtable <- data.frame(TN = c("T","N","T","N","N","T","T","N"),
+ Value = c(1,1,2,2,2,3,3,5))
> inputtable
TN Value
1 T 1
2 N 1
3 T 2
4 N 2
5 N 2
6 T 3
7 T 3
8 N 5
I want to remove rows that share a value in the Value column, but only when one of those rows has "T" and another has "N" in the TN column.
I have played around with duplicated(), but neither of my attempts works:
TNoverlaps.duprem <- TNoverlaps[ !(duplicated(TNoverlaps$Barcode) & ("T" %in% TNoverlaps$TN & "N" %in% TNoverlaps$TN)), ]
and
TNoverlaps.duprem <- TNoverlaps[ duplicated(TNoverlaps$Barcode) & !duplicated(TNoverlaps$Barcode, TNoverlaps$TN), ]
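If more than two rows share a value, as in rows 3-5 above, I want to remove all of them, because at least one of those rows is "T" and one is "N" in the TN column.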
This is the output I want:
> outputtable
TN Value
6 T 3
7 T 3
8 N 5
I found plenty of questions about duplicate rows and about removing rows based on multiple columns, but none that does quite this.
Answer 0: (score: 2)
You could try:
library(dplyr)
inputtable %>% group_by(Value) %>% filter(!(n_distinct(TN) >= 2))
Source: local data frame [3 x 2]
Groups: Value [2]
TN Value
(fctr) (dbl)
1 T 3
2 T 3
3 N 5
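If you would rather not load dplyr, the same idea can probably be written in base R. The following is only a sketch: the helper names values_with_T, values_with_N and mixed are made up for illustration, and it assumes the goal is to drop every Value that appears with both a "T" row and an "N" row.

```r
# Rebuild the example data from the question
inputtable <- data.frame(TN = c("T","N","T","N","N","T","T","N"),
                         Value = c(1,1,2,2,2,3,3,5))

# Values that occur in at least one "T" row, and in at least one "N" row
# (hypothetical helper names, not from the question)
values_with_T <- inputtable$Value[inputtable$TN == "T"]
values_with_N <- inputtable$Value[inputtable$TN == "N"]

# A row is "mixed" when its Value shows up under both labels
mixed <- inputtable$Value %in% values_with_T &
         inputtable$Value %in% values_with_N

# Keep only the rows whose Value is not mixed
outputtable <- inputtable[!mixed, ]
outputtable
```

Unlike the dplyr version, this keeps the original row names, so the result should show rows 6, 7 and 8 as in the desired output.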