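I'm filtering a large dataset for mirrored points: data points that are equal in magnitude but opposite in sign. These mirrored pairs tend to be very large and skew the standard deviation. My code works [i.e. it removes the mirrored payment pairs], but it takes hours to run. Is there a better way to do this in R?
Here is the code: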
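mirrors <- NULL  # collects the matched pairs, one row per pair
for (i in seq_along(data)) {
  for (j in seq_along(data)) {
    if (data[i] < 0) {
      # compare against the negated value so a negative entry cannot match itself
      if (data[j] == -data[i]) {
        mirrors <- rbind(mirrors, c(data[i], data[j]))
        break
      }
    }
  }
}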
The data is a large set of payment claims, approximately 200,000 items.
(I know, I know, for loops are blasphemy in R, but I couldn't find another way.)
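Answer 0 (score: 0)
As @mathematical.coffee pointed out, the answer depends on whether you want to remove the mirrored values or reduce them. Assuming mirrored values are interchangeable: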
M <- c(1:10, -(1:10), 11:25)
## remove all but one set of mirrored duplicates
M[!duplicated(abs(M))] # retains whatever set of mirrored duplicates comes first, positive or negative
unique(abs(M)) # retains positive half of mirrored duplicates
## remove all mirrored duplicate pairs (or triplets, or quadruplets, or...)
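d <- duplicated(abs(M), fromLast = TRUE) | duplicated(abs(M)) # TRUE for any value whose magnitude occurs more than once
M[!d] # logical indexing stays correct when nothing is duplicated; M[-which(...)] would return an empty vector in that case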
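Note that the `duplicated(abs(M))` approach also drops same-sign repeats (for example two payments of 50 with no -50), which may or may not be what you want. If only genuine +x / -x pairs should cancel, one at a time, something along the following lines might work. This is a minimal sketch, not a tested solution for your data: `remove_mirrored_pairs` is a hypothetical helper name, and it assumes a numeric vector without `NA`s whose mirrored amounts are exactly equal (no floating-point rounding differences).

```r
# Hypothetical helper: cancel each negative value against one positive value
# of the same magnitude, and keep whatever is left unmatched.
remove_mirrored_pairs <- function(x) {
  ux  <- unique(x)
  idx <- match(x, ux)                          # group id for each distinct value
  cnt <- tabulate(idx, nbins = length(ux))     # occurrences of each distinct value
  k   <- ave(seq_along(x), idx, FUN = seq_along)  # 1st, 2nd, ... occurrence of a value
  n_opp <- cnt[match(-x, ux)]                  # how often the opposite value occurs
  n_opp[is.na(n_opp)] <- 0L                    # opposite value never occurs
  # Drop the first min(count(x), count(-x)) occurrences on each side; zeros are kept
  drop <- x != 0 & k <= n_opp
  x[!drop]
}

payments <- c(100, -100, 100, 250, -250, -250, 40, 0)
remove_mirrored_pairs(payments)
# [1] 100 -250 40 0
```

Everything here is vectorised, so it avoids the roughly 200,000 x 200,000 comparisons (and the repeated `rbind` growth) of the nested loop in the question.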