Question

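This is my first R code. It is a very simple deduplication, but it runs so slowly that I can't believe it! My question is: is it normal for it to run this slowly, or is my code just bad? Here it is: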

file1 <- c(read.delim("file.txt", header = TRUE))

dedupes <- 0
i <- 1
n <- 1
# Compare every pair among the first 100 rows
while (i <= 100) {

  while (n <= 100) {

    if (file1$email[i] == file1$email[n] && i != n) {
      # Remember the number of duplicates
      dedupes <- dedupes + 1
      # Show the duplicate
      print(file1$email[i])
    }

    n <- n + 1

  }

  n <- 1
  i <- i + 1

}

# Show the number of duplicates (each pair is counted twice, hence the division)
cat("There are ", dedupes / 2, " duplicates")

Many thanks, Saitam

Answer 1

Nested loops are well known to be slow in R. You need to vectorize your computation or use existing optimized functions, as in BondedDust's suggestion.
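As a rough illustration, here is a minimal sketch of a vectorized version. It assumes that base R's `duplicated()` and `table()` are the kind of optimized functions meant here (the thread does not say which function BondedDust suggested), and it uses a small made-up `emails` vector in place of `file1$email`.

```r
# Hypothetical sample data standing in for file1$email
emails <- c("a@x.com", "b@x.com", "a@x.com", "c@x.com", "b@x.com", "a@x.com")

# duplicated() flags every occurrence of a value after its first one
is_dup <- duplicated(emails)

# Distinct addresses that appear more than once
dup_values <- unique(emails[is_dup])

# Number of redundant rows (what would be dropped by deduplication)
n_extra <- sum(is_dup)

# Number of matching pairs, the quantity dedupes/2 counts in the loop version:
# a value occurring k times contributes choose(k, 2) pairs
counts <- as.vector(table(emails))
n_pairs <- sum(choose(counts, 2))

print(dup_values)
cat("There are", n_extra, "redundant rows and", n_pairs, "matching pairs\n")
```

To drop the repeats rather than count them, `emails[!duplicated(emails)]` or `unique(emails)` should do it. Both avoid the explicit pairwise comparison that makes the nested loops slow.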
