How to remove duplicated data from a data frame

Time: 2018-09-20 14:25:50

Tags: r duplicates

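So far my code looks like the snippet below. I have been trying to remove the letters that appear in both the new and old vectors. The letters stand for emails. I tried the unique and distinct functions, but they keep one copy of each duplicated value, whereas I need every copy of a duplicated value removed. This is the vector I want: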

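c(b,c,e,f,t,r,w,u,p,q)

library(dplyr)  # provides transmute()

new <- c("a","b","c","d","e","f","t")
old <- c("r","w","u","a","d","p","q")
num <- c(1:7)
df_new <- data.frame(num, new)
df_old <- data.frame(num, old)

# Rename both columns to "emails" so the two data frames share column names
df_new <- transmute(df_new, num, emails = new)
df_old <- transmute(df_old, num, emails = old)

# Full outer join on the shared columns (num, emails)
all_emails <- merge(df_new, df_old, all = TRUE)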

1 Answer:

Answer 0 (score: 1)

Based on what you have shown, you are complicating things unnecessarily by putting the vectors into data frames. Try this:

new <- c("a","b","c","d","e","f","t")
old <- c("r","w","u","a","d","p","q")
x = c(new, old)
result = x[!duplicated(x) & !duplicated(x, fromLast = TRUE)]
result
# [1] "b" "c" "e" "f" "t" "r" "w" "u" "p" "q"
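
If the emails do need to stay in a data frame, the same two-sided duplicated() test can be used to filter rows. The following is only a sketch under assumptions: the stacked data frame and its source column are hypothetical and are not part of the original question's code.

```r
new <- c("a","b","c","d","e","f","t")
old <- c("r","w","u","a","d","p","q")

# Hypothetical stacked data frame: one row per email, tagged with its origin
all_emails <- data.frame(
  source = rep(c("new", "old"), times = c(length(new), length(old))),
  emails = c(new, old),
  stringsAsFactors = FALSE
)

# TRUE for every row whose email occurs more than once:
# duplicated() flags the later copies, fromLast = TRUE flags the earlier ones
is_dup <- duplicated(all_emails$emails) |
  duplicated(all_emails$emails, fromLast = TRUE)

only_once <- all_emails[!is_dup, ]
only_once$emails
# [1] "b" "c" "e" "f" "t" "r" "w" "u" "p" "q"
```

Scanning from both ends is what removes every copy of a repeated value, rather than keeping the first one as unique() would.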

Another approach: if each vector contains no duplicates on its own, you only need to drop everything that appears in both new and old:

result = setdiff(union(new, old), intersect(new, old))
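
As a quick check, the sketch below compares the two approaches on the question's data and then on a small made-up pair of vectors (new2 and old2 are hypothetical) where one vector repeats a value internally. It illustrates why the uniqueness condition matters.

```r
new <- c("a","b","c","d","e","f","t")
old <- c("r","w","u","a","d","p","q")
x <- c(new, old)

by_duplicated <- x[!duplicated(x) & !duplicated(x, fromLast = TRUE)]
by_sets <- setdiff(union(new, old), intersect(new, old))
identical(by_duplicated, by_sets)
# [1] TRUE

# Hypothetical input where "b" is repeated inside new2 but absent from old2
new2 <- c("a", "b", "b")
old2 <- c("a", "c")
x2 <- c(new2, old2)

dup2 <- x2[!duplicated(x2) & !duplicated(x2, fromLast = TRUE)]
sets2 <- setdiff(union(new2, old2), intersect(new2, old2))
dup2
# [1] "c"
sets2
# [1] "b" "c"
```

The set-based version only removes values shared between the two vectors, so a value repeated within a single vector survives once, while the duplicated() version removes it entirely.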