I created a character vector that contains some missing values, for example:
bp <- rep(NA, 5)
bp[c(2,4)] <- c("sugar","milk")
> bp
[1] NA "sugar" NA "milk" NA
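I'm looking for a way to use bp to search a larger data frame and find every place where bp occurs (and where), with each NA treated as a wildcard that is filled in by whatever value sits at that position.
For example,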
[1] any1 "sugar" any2 "milk" any3
[2] any2 "sugar" any5 "milk" any1
[3] any6 "sugar" any1 "milk" any3
[4] any8 "sugar" any7 "milk" any6
[5] any1 "sugar" any2 "milk" any3
Edit: part of the data frame looks like this
c("milk", "sugar", "sugar", "creme", "carw", "milk", "creme", "carw",
"sugar", "carw", "creme", "sugar", "sugar", "milk", "milk", "creme",
"sugar", "sugar", "carw", "carw", "carw", "milk", "sugar", "sugar",
"carw", "sugar", "milk", "sugar", "creme", "carw", "carw", "carw",
"creme", "carw", "carw", "creme", "creme", "milk", "carw", "milk",
"milk", "creme", "creme", "creme", "milk", "milk", "creme", "carw",
"carw", "milk", "milk", "creme", "creme", "carw", "carw", "milk",
"sugar", "carw", "milk", "carw", "creme", "sugar", "sugar", "creme",
"sugar", "sugar", "creme", "sugar", "carw", "sugar", "carw",
"carw", "creme", "sugar", "milk", "milk", "carw", "carw", "milk",
"creme", "sugar", "carw", "milk", "sugar", "sugar", "milk", "sugar",
"creme", "milk", "milk", "carw", "milk", "sugar", "carw", "sugar",
"carw", "creme", "creme", "carw", "milk", "milk", "milk", "milk",
"carw", "carw", "milk", "milk", "carw", "sugar", "milk", "milk",
"milk", "creme", "carw", "creme", "milk", "milk", "milk", "creme",
"carw", "milk", "carw", "carw", "carw", "carw", "carw", "carw"
)
I would normally use the following to search the whole data frame, but it is tricky in this case because of the NAs.
library(data.table)
n1 <- length(bp)
bp.pos <- setDT(data.frame)[, which(Reduce(`&`, Map(`==`, shift(value1, seq(n1)-1,
type = "lead"),
bp)))]
Any help would be appreciated.
Answer 0 (score: 1)
This is based on my understanding of your question. I call the vector you shared x:
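# Column i compares bp[i] with the i-th element of every length(bp)-long window of x
test = sapply(seq_along(bp), function(i) bp[i] == x[i:(length(x) - length(bp) + i)])
# Comparing against NA yields NA; treat those positions as wildcards (TRUE)
test = test | is.na(test)
# Start indices of the windows in which every position matches
res = which(apply(test, 1, all))
# Expand each start index into the full index range of its window
res = lapply(res, function(s) s + seq_along(bp) - 1)
final = lapply(res, function(z) x[z])
# Name each match by its start index in x
names(final) = sapply(res, "[", 1)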
# $`11`
# [1] "creme" "sugar" "sugar" "milk" "milk"
#
# $`12`
# [1] "sugar" "sugar" "milk" "milk" "creme"
#
# $`56`
# [1] "milk" "sugar" "carw" "milk" "carw"
#
# $`73`
# [1] "creme" "sugar" "milk" "milk" "carw"
#
# $`80`
# [1] "creme" "sugar" "carw" "milk" "sugar"
#
# $`83`
# [1] "milk" "sugar" "sugar" "milk" "sugar"
#
# $`86`
# [1] "milk" "sugar" "creme" "milk" "milk"
#
# $`108`
# [1] "carw" "sugar" "milk" "milk" "milk"
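The result is a named list in which the names are the starting indices in x and the values are the matching vectors. This gives you both the "where" and the matches in a single object.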
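If you only need the start positions, the same idea can be wrapped in a small function that compares just the non-NA positions of the pattern. This is a minimal sketch rather than a tested drop-in: find_pattern is a hypothetical helper name, and the short x below is made-up illustration data, not the vector from the question. For the data frame you would presumably pass the relevant column (value1 in the question's code).

```r
# Hypothetical helper: return the start index of every window of `x`
# that matches `pattern`, treating NA entries in `pattern` as wildcards.
find_pattern <- function(x, pattern) {
  n <- length(pattern)
  if (length(x) < n) return(integer(0))
  fixed  <- which(!is.na(pattern))        # positions that must match exactly
  starts <- seq_len(length(x) - n + 1)    # candidate window start indices
  ok <- rep(TRUE, length(starts))
  for (k in fixed) {
    # x[starts + k - 1] is the k-th element of each candidate window
    elem <- x[starts + k - 1]
    ok <- ok & !is.na(elem) & elem == pattern[k]
  }
  starts[ok]
}

# Made-up illustration data
x  <- c("carw", "sugar", "creme", "milk", "milk", "sugar", "carw", "milk", "creme")
bp <- c(NA, "sugar", NA, "milk", NA)

find_pattern(x, bp)
# [1] 1 5
```

Because only the fixed positions are compared, the NA handling step disappears, and an NA in x itself never counts as a match for a fixed position. To recover the matched windows, something like lapply(find_pattern(x, bp), function(s) x[s + seq_along(bp) - 1]) should work.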