Speeding up a for loop (or moving away from it) with some decision rules

Time: 2018-07-13 16:45:30

Tags: r performance for-loop

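I have a for loop that checks, for each element of the vector vals, whether it appears in 4 different vectors (dp, de, up, ue), with these restrictions:

  1. it can be found only once in dp or de; and
  2. it can be found only once in up or ue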

The vectors I am checking have millions of elements and the loop takes several hours, so I assume the code below can be sped up.

MRE:

vals <- c('a', 'b', 'c', 'd', 'e', 'f') # 6 elements to be verified

# only 1 of these two
dp <- c('a', 'c', 'd','f', 'f')
de <- c('b','a', 'd')

# only one of these two
up <- c('b', 'd', 'e')
ue <- c('c')

i <- list()
for (val in vals) {
  dipa <- sum(grepl(val, dp)) # attempts to find val in dp and sums
  ulpa <- sum(grepl(val, up)) # attempts to find val in up and sums
  diex <- sum(grepl(val, de)) # attempts to find val in de and sums
  ulex <- sum(grepl(val, ue)) # attempts to find val in ue and sums

  # overall, it has to be found 2 times exactly
  f <- sum(sum(dipa) + sum(ulpa) + sum(diex) + sum(ulex)) == 2
  pars <- dipa + diex == 1 # once in dipa or diex
  excs <- ulpa + ulex == 1 # once in ulpa or ulex

  if(isTRUE(f) & isTRUE(pars) & isTRUE(excs)) {
    i[val] <- 1 # if all of these 3 conditions are true, then add
  } else {
    next
  }
}

In the example above, i should only retain:

  1. b (as it is found once in de and once in up)
  2. c (as it is found once in dp and once in ue)

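Each element of vals can show up several times in each of the other 4 vectors, but ideally it should appear only twice, under the restrictions above.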

1 Answer:

Answer 0: (score: 1)

Does this work?

pervec <- sapply(list(dp,de,up,ue),
                 function(a) rowSums(sapply(a, `==`, vals)))
pervec
#      [,1] [,2] [,3] [,4]
# [1,]    1    1    0    0
# [2,]    0    1    1    0
# [3,]    1    0    0    1
# [4,]    1    1    1    0
# [5,]    0    0    1    0
# [6,]    2    0    0    0

ind <- xor(pervec[,1] == 1, pervec[,2] == 1) & xor(pervec[,3] == 1, pervec[,4] == 1)
ind
# [1] FALSE  TRUE  TRUE FALSE FALSE FALSE

vals[ind]
# [1] "b" "c"
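
For vectors with millions of elements, the `sapply(a, `==`, vals)` step above builds a `length(vals)` by `length(a)` comparison matrix, which may not fit in memory. A minimal sketch of an alternative is shown below. It assumes that exact matching is what is wanted (the loop in the question uses `grepl`, which also matches substrings) and that `vals` has no duplicates. The helper name `count_in` is hypothetical, not part of the original answer.

```r
vals <- c("a", "b", "c", "d", "e", "f")
dp <- c("a", "c", "d", "f", "f")
de <- c("b", "a", "d")
up <- c("b", "d", "e")
ue <- c("c")

# Hypothetical helper: number of exact occurrences of each element of `vals`
# in `x`. match() maps every element of `x` to its position in `vals` (NA if
# absent), and tabulate() counts those positions, silently ignoring NAs.
# Assumes `vals` contains no duplicates.
count_in <- function(x, vals) {
  tabulate(match(x, vals), nbins = length(vals))
}

n_dp <- count_in(dp, vals)
n_de <- count_in(de, vals)
n_up <- count_in(up, vals)
n_ue <- count_in(ue, vals)

# Same rules as the loop in the question: exactly once across dp/de and
# exactly once across up/ue. The "found 2 times overall" check then holds
# automatically, since 1 + 1 == 2.
keep <- (n_dp + n_de == 1) & (n_up + n_ue == 1)
vals[keep]
# [1] "b" "c"
```

This sketch uses the sum-based rule from the question (`dipa + diex == 1`) rather than `xor`. The two differ in one case: `xor(pervec[,1] == 1, pervec[,2] == 1)` would also accept an element found once in `dp` and twice in `de`.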