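I want to remove duplicate rows while ignoring the data frame's variable names, so that two rows count as duplicates if they hold the same values regardless of which column each value is in. For example: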
dat1 = data.frame(var1 = head(letters), var2 = 1:6)
dat1$var1 = as.character(dat1$var1)
dat2 = data.frame(var1 = 1:4, var2 = c("a", "b", "c", "z"))
dat = rbind(dat1, dat2)
# > dat
# var1 var2
# 1 a 1
# 2 b 2
# 3 c 3
# 4 d 4
# 5 e 5
# 6 f 6
# 7 1 a
# 8 2 b
# 9 3 c
# 10 4 z
Expected output:
# > dat
# var1 var2
# 1 a 1
# 2 b 2
# 3 c 3
# 4 d 4
# 5 e 5
# 6 f 6
# 7 4 z
Answer 0: (score: 2)
You can use
dat[!duplicated(t(apply(dat,1, sort))),]
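To make the one-liner above easier to follow, here is a step-by-step sketch of the same idea. The intermediate names (`sorted_rows`, `keep`, `result`) are mine and not part of the answer, and I pass `stringsAsFactors = FALSE` explicitly so that it behaves the same on R versions before 4.0.

```r
# Rebuild the example data from the question
dat1 <- data.frame(var1 = head(letters), var2 = 1:6, stringsAsFactors = FALSE)
dat2 <- data.frame(var1 = 1:4, var2 = c("a", "b", "c", "z"), stringsAsFactors = FALSE)
dat <- rbind(dat1, dat2)

# Sort the values within each row so that (a, 1) and (1, a) look identical.
# apply() over rows returns each sorted row as a column, hence the t().
sorted_rows <- t(apply(dat, 1, sort))

# duplicated() on a matrix compares whole rows and flags later repeats
keep <- !duplicated(sorted_rows)

# Keep only the first occurrence of each unordered pair
result <- dat[keep, ]
```

Because only the first occurrence is kept, the surviving rows retain their original row names (1 to 6 and 10).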
Answer 1: (score: 0)
In base R, you can do the following:
dat[!duplicated(do.call(function(x,y)paste(pmax(x,y),pmin(x,y)),unname(dat))),]
# var1 var2
# 1 a 1
# 2 b 2
# 3 c 3
# 4 d 4
# 5 e 5
# 6 f 6
# 10 4 z
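The `pmax`/`pmin` trick builds one key per row with the larger value first and the smaller second, so the column order no longer matters. A minimal sketch of that idea as a function follows; `drop_unordered_dups` is a hypothetical name of my own, and it assumes the data frame has exactly two columns.

```r
# Rebuild the example data from the question
dat1 <- data.frame(var1 = head(letters), var2 = 1:6, stringsAsFactors = FALSE)
dat2 <- data.frame(var1 = 1:4, var2 = c("a", "b", "c", "z"), stringsAsFactors = FALSE)
dat <- rbind(dat1, dat2)

# Hypothetical helper: drop rows that repeat an earlier unordered pair
drop_unordered_dups <- function(df) {
  stopifnot(ncol(df) == 2)  # pmax/pmin pairing only covers two columns here
  a <- as.character(df[[1]])
  b <- as.character(df[[2]])
  # Larger value first, smaller second, so (a, 1) and (1, a) share a key
  key <- paste(pmax(a, b), pmin(a, b))
  df[!duplicated(key), ]
}

result <- drop_unordered_dups(dat)
```

Note that `paste` joins the two values with a space, so values that themselves contain spaces could in principle produce colliding keys; passing a different `sep` would be one way around that.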
You can also use the igraph library:
library(igraph)
dat%>%
graph_from_data_frame(directed = FALSE)%>%
E%>%
attr("vnames")%>%
unique%>%
read.table(text=.,sep="|",col.names = c("var1","var2"))
# var1 var2
# 1 a 1
# 2 b 2
# 3 c 3
# 4 d 4
# 5 e 5
# 6 f 6
# 7 4 z