Question

I would like to know how to match a vector as a whole. I have two vectors, a and b:

a <- c(5,1,2,6,3,4,8)
b <- c(1,2,3)

I know some ways to match the individual elements of a vector, such as

 match(b,a)
#[1] 2 3 5

b%in%a
#[1] TRUE TRUE TRUE

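With match() I get the position of each individual element, and with %in% I get a logical value for each element. But I want to match the whole vector b against a at once. It should not match individual elements but the entire vector, and return the position where the match starts.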

Desired output:

No match is found in the vectors above, because I am looking for the whole vector rather than its individual elements.

Answer 1

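How about checking the length of the na.omit(match()) output against the length of the vector we are testing? Note that this only confirms that every element of b occurs somewhere in a; it does not require the elements to be adjacent or in order.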

ifelse(length(na.omit(match(b, a))) == length(b), match(b, a)[1], NA)
#[1] 2
# adding a new value to b so that it won't match, we get
b  <- c(1, 2, 3, 9)
ifelse(length(na.omit(match(b, a))) == length(b), match(b, a)[1], NA)
#[1] NA
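
The same check can be written with a plain if/else and %in%, which may be easier to read. This is only a sketch: first_if_all_present is a hypothetical helper name, not something from the answer above, and it keeps the same behaviour (a membership test, not a test for a contiguous run).

```r
a <- c(5, 1, 2, 6, 3, 4, 8)

# Hypothetical helper: returns the position of b[1] in a when every
# element of b occurs somewhere in a, otherwise NA.
first_if_all_present <- function(a, b) {
  if (all(b %in% a)) match(b, a)[1] else NA_integer_
}

first_if_all_present(a, c(1, 2, 3))     # 2: all elements present, though not adjacent
first_if_all_present(a, c(1, 2, 3, 9))  # NA: 9 is not in a
```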

Answer 2

You can always brute-force it by looping through the vectors element by element.

a <- c(5,1,2,6,3,4,8)
b <- c(1,2,3)

matchr <- function(a,b){

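    # First, loop through every possible start position in the a vector
    # (the last valid start is length(a) - length(b) + 1)
    for(i in seq_len(max(0, length(a)-length(b)+1))){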

        pos <- FALSE

        # Next loop through the b vector, 
        for(j in 1:length(b)){

            # as we're looping through b, check if each element matches the corresponding part of the a vector we're currently at.
            if( a[i+j-1] == b[j]){
                pos <- TRUE
            } else{
                pos <- FALSE
                break
            }
        }

        # if all the elements match, return where we are in the a vector
        if(pos == TRUE){
            return(i)
        } 
    }
    # if we finish the a vector and never got a match, return no match.
    return("No match")
}

matchr(a,b)
[1] "No match"

d <- c(7,5,4,2,1,2,3,8,5)

matchr(d,b)
[1] 5

e <- c(2,3,8)

matchr(d,e)
[1] 6

If your real vectors are larger, you might consider compiling the function with matchr <- compiler::cmpfun(matchr) or rewriting it with Rcpp.
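
Before reaching for compilation, a partly vectorised base R version may already be enough. The sketch below is an alternative technique, not the code from this answer: it starts from every position where b[1] occurs and discards candidates as soon as a later element disagrees. The name find_subvector_starts is hypothetical, and the code assumes neither vector contains NA.

```r
# Hypothetical helper: returns all start positions of b inside a
# (integer(0) when there is no match). Assumes no NA values.
find_subvector_starts <- function(a, b) {
  n <- length(a)
  m <- length(b)
  if (m == 0 || m > n) return(integer(0))
  # candidate starts: positions where the first element of b occurs
  cand <- which(a[seq_len(n - m + 1)] == b[1])
  # keep only the candidates whose j-th following element equals b[j]
  for (j in seq_len(m)[-1]) {
    cand <- cand[a[cand + j - 1] == b[j]]
  }
  cand
}

a <- c(5, 1, 2, 6, 3, 4, 8)
d <- c(7, 5, 4, 2, 1, 2, 3, 8, 5)

find_subvector_starts(a, c(1, 2, 3))  # integer(0)
find_subvector_starts(d, c(1, 2, 3))  # 5
find_subvector_starts(d, c(3, 8, 5))  # 7 (a match ending at the last element)
```

Returning integer(0) rather than a string keeps the result type consistent, so callers can test length(result) > 0.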

Edit: another way

Build a list that splits the a vector into sub-vectors of length(b), then test whether list(b) is in that list:

matchr2 <- function(a){
    m <- list()
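    # one window per possible start position, including the last one
    # (b is taken from the calling environment here)
    for(i in seq_len(max(0, length(a)-length(b)+1))){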
        m[[i]] <- c( a[i : (length(b) + i - 1)] ) 
    }
    m
}

mlist <- matchr2(a)

list(b) %in% mlist
[1] FALSE

mlist <- matchr2(d)

list(b) %in% mlist
[1] TRUE

Again, compiling the function can give you a significant speed advantage.
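
The list test above only says whether b occurs, not where. A minimal sketch of the same sliding-window idea that also reports the start position could look like this; window_match is a hypothetical name, and the code assumes the vectors contain no NA values.

```r
# Hypothetical helper: compares b with every window of length(b) in a
# and returns the first start position, or NA if no window matches.
window_match <- function(a, b) {
  n <- length(a)
  m <- length(b)
  if (m == 0 || m > n) return(NA_integer_)
  starts <- seq_len(n - m + 1)
  # TRUE for each start whose window equals b element by element
  hits <- vapply(starts,
                 function(i) isTRUE(all(a[i:(i + m - 1)] == b)),
                 logical(1))
  # which() on an all-FALSE vector is empty, so [1] yields NA
  which(hits)[1]
}

a <- c(5, 1, 2, 6, 3, 4, 8)
d <- c(7, 5, 4, 2, 1, 2, 3, 8, 5)

window_match(a, c(1, 2, 3))  # NA
window_match(d, c(1, 2, 3))  # 5
```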

Answer 3

One approach, with a few examples:

wholematch<-function(a=c(5,1,3,2,1,2,5,6,2,6),b=c(1,2,6))
{
    #Try every possible start position in "a", including the last one
    for(loop.a in seq_len(max(0,length(a)-length(b)+1)))
    {
    #pmatch gives the first occurrence of each value of b in a. To be sure of finding the consecutive matches, use pmatch starting from all the possible positions of "a"
    #Caveat: pmatch compares values as strings and allows partial matches, so 1 can match 10 when there is no exact 1
    wmatch<-(loop.a-1)+pmatch(b,a[loop.a:length(a)])
    #If at any time the number of matches is less than the length of the vector to match, we will never find a match. Return NA
    if(anyNA(wmatch)) return(NA)
    #If all indices are adjacent and in increasing order, return the vector of indices
    if(all(diff(wmatch)==1)) return(wmatch) #return(wmatch[1]) if you only want the start
    }
    #No start position produced a run of adjacent indices
    return(NA)
}

wholematch()
[1] NA

wholematch(a=c(5,1,3,2,1,2,5,6,2,6),b=c(6,2,6))
[1]  8  9 10
