Remove duplicate rows regardless of order

Time: 2019-06-24 18:18:19

Tags: r duplicates

I have a data.frame like this:

df <- structure(list(X1 = c("PF00041", "PF00041", "PF00041", "PF00041", 
"PF00041", "PF00041", "PF00041", "PF00041", "PF00041", "PF00047", 
"PF00041", "PF00041", "PF00041", "PF00054", "PF00054", "PF02210", 
"PF07679", "PF07714", "PF07714", "PF07714", "PF07714", "PF07714", 
"PF07714", "PF00041", "PF00041", "PF00041"), X2 = c("PF00041", 
"PF00041", "PF00041", "PF00041", "PF00041", "PF00041", "PF07679", 
"PF07679", "PF07679", "PF13895", "PF00047", "PF00047", "PF00047", 
"PF02210", "PF13895", "PF07645", "PF13895", "PF07714", "PF07714", 
"PF07714", "PF07714", "PF07714", "PF07714", "PF13895", "PF13895", 
"PF13895"), pfam_name.x = c("fn3", "fn3", "fn3", "fn3", "fn3", 
"fn3", "fn3", "fn3", "fn3", "ig", "fn3", "fn3", "fn3", "Laminin_G_1", 
"Laminin_G_1", "Laminin_G_2", "I-set", "Pkinase_Tyr", "Pkinase_Tyr", 
"Pkinase_Tyr", "Pkinase_Tyr", "Pkinase_Tyr", "Pkinase_Tyr", "fn3", 
"fn3", "fn3"), pfam_name.y = c("fn3", "fn3", "fn3", "fn3", "fn3", 
"fn3", "I-set", "I-set", "I-set", "Ig_2", "ig", "ig", "ig", "Laminin_G_2", 
"Ig_2", "EGF_CA", "Ig_2", "Pkinase_Tyr", "Pkinase_Tyr", "Pkinase_Tyr", 
"Pkinase_Tyr", "Pkinase_Tyr", "Pkinase_Tyr", "Ig_2", "Ig_2", 
"Ig_2"), value.x = c("5", "5", "13", "13", "17", "17", "5", "13", 
"17", "18", "5", "13", "17", "11", "11", "12", "14", "6", "6", 
"15", "15", "20", "20", "5", "13", "17"), value.y = c("13", "17", 
"5", "17", "5", "13", "14", "14", "14", "19", "18", "18", "18", 
"12", "19", "8", "19", "15", "20", "6", "20", "6", "15", "19", 
"19", "19")), row.names = c(2L, 3L, 4L, 6L, 7L, 8L, 10L, 11L, 
12L, 13L, 15L, 16L, 17L, 19L, 20L, 25L, 27L, 29L, 30L, 31L, 33L, 
34L, 35L, 38L, 39L, 40L), class = "data.frame")

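I would like to filter this data.frame based on the columns value.x and value.y, but I do not want to keep rows in which the two values are merely swapped. For example, row 1 has the values 5 and 13 and row 3 has the values 13 and 5, so I want to get rid of row 3.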
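One way to express this requirement is to build an order-independent key from the two value columns and drop rows whose key has already been seen. The sketch below is only an illustration under stated assumptions: `df_small` is a reduced subset of the data above (rows 1 to 4 and 7, with the name columns omitted), and `key` and `result` are names introduced here, not part of the original post.

```r
# Reduced subset of the question's data (rows 1-4 and 7), for illustration only.
df_small <- data.frame(
  X1      = c("PF00041", "PF00041", "PF00041", "PF00041", "PF00041"),
  X2      = c("PF00041", "PF00041", "PF00041", "PF00041", "PF07679"),
  value.x = c("5", "5", "13", "13", "5"),
  value.y = c("13", "17", "5", "17", "14"),
  stringsAsFactors = FALSE
)

# Order-independent key: the "smaller" value first, the "larger" second.
# The values are character strings, so pmin()/pmax() compare them as text;
# that is fine here because the key only has to be consistent, not numeric.
key <- paste(pmin(df_small$value.x, df_small$value.y),
             pmax(df_small$value.x, df_small$value.y),
             sep = "_")

# Keep the first row of each unordered pair; the other columns are untouched.
result <- df_small[!duplicated(key), ]
print(result)
```

Because only the key is reordered, X1 and X2 keep their original values. If swapped pairs should count as duplicates only when X1 and X2 also match, those columns could be pasted into the key as well; the question does not say which behaviour is wanted.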

I first tried sorting, but because I have other columns as well, the sort mixes all the columns together. For example:

data.frame(unique(t(apply(df, 1, sort))), stringsAsFactors = F)

In the resulting table, I can see that Pkinase_Tyr now appears in the X1 column.
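The mixing happens because `apply(df, 1, sort)` sorts every cell of a row, so identifiers, names and values all change places. A possible variation, sketched here under assumptions, sorts only the two value columns and uses the outcome solely to decide which rows to keep. `df_small` is a reduced subset of the data above (rows 10, 18, 19 and 20), and `value_cols`, `sorted_vals` and `deduped` are illustrative names.

```r
# Reduced subset of the question's data (rows 10, 18, 19, 20), for illustration only.
df_small <- data.frame(
  X1          = c("PF00047", "PF07714", "PF07714", "PF07714"),
  X2          = c("PF13895", "PF07714", "PF07714", "PF07714"),
  pfam_name.x = c("ig", "Pkinase_Tyr", "Pkinase_Tyr", "Pkinase_Tyr"),
  pfam_name.y = c("Ig_2", "Pkinase_Tyr", "Pkinase_Tyr", "Pkinase_Tyr"),
  value.x     = c("18", "6", "6", "15"),
  value.y     = c("19", "15", "20", "6"),
  stringsAsFactors = FALSE
)

value_cols <- c("value.x", "value.y")

# Sort within each row, but only across the two value columns.
# apply() returns one column per input row, hence the t().
sorted_vals <- t(apply(df_small[, value_cols], 1, sort))

# duplicated() on a matrix compares whole rows; use it only as a row filter,
# so the original (unsorted) data frame is what gets returned.
deduped <- df_small[!duplicated(sorted_vals), ]
print(deduped)
```

Here the last row (15, 6) is dropped as a swapped copy of the second row (6, 15), while pfam_name.x and X1 remain in their own columns.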

0 answers:

No answers