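I have a for loop that checks, for each element of the vector `vals`, whether it is found in 4 different vectors (`dp`, `de`, `up`, `ue`), subject to two conditions: it must be found exactly once in `dp` or `de`, and exactly once in `up` or `ue`. The vector I want to check has millions of elements and the loop takes several hours, so I think the following code could be sped up.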
MRE:
In the example below, `i` should retain only `b` and `c`:
vals <- c('a', 'b', 'c', 'd', 'e', 'f') # 6 elements to be verified
# exactly one match across these two
dp <- c('a', 'c', 'd', 'f', 'f')
de <- c('b', 'a', 'd')
# exactly one match across these two
up <- c('b', 'd', 'e')
ue <- c('c')
i <- list()
for (val in vals) {
  dipa <- sum(grepl(val, dp)) # number of elements of dp that match val
  ulpa <- sum(grepl(val, up)) # number of elements of up that match val
  diex <- sum(grepl(val, de)) # number of elements of de that match val
  ulex <- sum(grepl(val, ue)) # number of elements of ue that match val
  f <- (dipa + ulpa + diex + ulex) == 2 # overall, val has to be found exactly 2 times
  pars <- dipa + diex == 1 # once in dp or de
  excs <- ulpa + ulex == 1 # once in up or ue
  if (isTRUE(f) && isTRUE(pars) && isTRUE(excs)) {
    i[val] <- 1 # all 3 conditions are true, so keep val
  } else {
    next
  }
}
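`b` is kept because it is found once in `de` and once in `up`; `c` is kept because it is found once in `dp` and once in `ue`.
Each element of `vals` can appear multiple times in the other 4 vectors, but ideally it should appear exactly twice, under the restrictions above.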
Answer 0 (score: 1)
Does this work?
pervec <- sapply(list(dp,de,up,ue),
function(a) rowSums(sapply(a, `==`, vals)))
pervec
# [,1] [,2] [,3] [,4]
# [1,] 1 1 0 0
# [2,] 0 1 1 0
# [3,] 1 0 0 1
# [4,] 1 1 1 0
# [5,] 0 0 1 0
# [6,] 2 0 0 0
ind <- xor(pervec[,1] == 1, pervec[,2] == 1) & xor(pervec[,3] == 1, pervec[,4] == 1)
ind
# [1] FALSE TRUE TRUE FALSE FALSE FALSE
vals[ind]
# [1] "b" "c"
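For inputs with millions of elements, the `sapply` approach builds a comparison matrix of size `length(vals)` by `length(a)` for each vector, which may not fit in memory. A minimal sketch of a lighter alternative counts occurrences with `match()` and `tabulate()` instead. The helper name `count_in` is hypothetical and not part of the original post. The sketch assumes the elements of `vals` are unique and that exact string equality is intended (unlike `grepl`, which does regular-expression matching), and it follows the sum-based conditions of the question's loop rather than the `xor` test above.

```r
# Hypothetical helper (not from the original post): for each element of
# `vals`, count how many times it occurs in `x` using exact matching.
# Assumes `vals` has no duplicates, since match() returns the first hit only.
count_in <- function(x, vals) {
  # match() maps each element of x to its position in vals (NA if absent);
  # tabulate() counts the positions and ignores NA.
  tabulate(match(x, vals), nbins = length(vals))
}

vals <- c('a', 'b', 'c', 'd', 'e', 'f')
dp <- c('a', 'c', 'd', 'f', 'f')
de <- c('b', 'a', 'd')
up <- c('b', 'd', 'e')
ue <- c('c')

# Total occurrences in each pair of vectors
n_down <- count_in(dp, vals) + count_in(de, vals)
n_up   <- count_in(up, vals) + count_in(ue, vals)

# Keep values found exactly once in dp/de and exactly once in up/ue
keep <- n_down == 1 & n_up == 1
vals[keep]
# [1] "b" "c"
```

Each call does a single hashed lookup pass over `x`, so the work should grow roughly with the total length of the inputs rather than with the product of their lengths.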