Question

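I am having trouble selecting duplicate rows in R. I have a data frame with 14 columns and 1 million rows. I need to compare the rows with one another, that is, find the rows that are identical and treat them as duplicates. I want to retrieve the duplicated rows this way. My data frame looks like this: (data frame sample)

The last two rows are identical, so they need to be marked with a flag value of 1. I don't know how to start.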

I have tried this code:

df <- unique(data[,1:97])  # this gives me the set of unique rows, not the number of duplicates
dim(data[duplicated(data),])[1]  # this gives me the number of duplicates, but not their IDs

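I need to know the IDs of the duplicated rows.

My goal is to check every row and record either the total number of duplicated rows or their row numbers.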

Answer 1

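Take a look at the duplicated() function. It returns a logical vector that can be used to drop the duplicated rows or, when used the other way round, to keep only them.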
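As a rough sketch of how this could look, the example below uses a small made-up data frame (the column names `id` and `value`, and the variable names `later_copies`, `all_copies`, `dup_rows` and `n_dups`, are assumptions for illustration, not taken from the question). By default duplicated() marks only the second and later copies of a row, so the sketch combines it with a second pass using `fromLast = TRUE` to mark every row that has an identical twin.

```r
# Hypothetical toy data; the last two rows are identical, as in the question.
data <- data.frame(
  id    = c(1, 2, 3, 3),
  value = c("a", "b", "c", "c"),
  stringsAsFactors = FALSE
)

# TRUE only for the second and later copies of a row.
later_copies <- duplicated(data)

# Also scanning from the end marks the first copy of each group,
# so the OR is TRUE for every row that has an identical twin.
all_copies <- later_copies | duplicated(data, fromLast = TRUE)

# Flag column: 1 for rows that occur more than once, 0 otherwise.
data$flag <- as.integer(all_copies)

dup_rows <- which(all_copies)  # row numbers of all duplicated rows
n_dups   <- sum(later_copies)  # number of redundant copies
```

Here `dup_rows` holds the row numbers asked for and `n_dups` the count of extra copies. Whether the first occurrence should also be flagged depends on what is needed; drop the `fromLast = TRUE` term to flag only the later copies.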
