Finding partial text in the rows of a vector [r]

Time: 2012-11-15 19:02:22

Tags: r

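Basically, I am looking for a function that takes a vector of strings and a search term as input and outputs a Boolean vector. After that, I would also like to take a list of search strings and run it through the same function to output multiple result vectors, one per search string.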

So the initial data looks like this:

> searchVector <- cbind(c("aaa1","aaa2","","bbb1,aaa1,ccc1", "ddd1,ccc1,aaa1"))
> searchVector
     [,1]         
[1,] "aaa1"        
[2,] "aaa2"        
[3,] ""           
[4,] "bbb1,aaa1,ccc1"
[5,] "ddd1,ccc1,aaa1"

This is what we would like to see:

>findTrigger(c("aaa","bbb"),searchVector)
         [aaa]  [bbb]
    [1,] 1     0   
    [2,] 1     0   
    [3,] 0     0      
    [4,] 1     1
    [5,] 1     0

I made the following attempt:

searchfunction <- function (searchTerms, searchVector) {
  output = matrix( nrow = length(searchVector), 
             ncol = length(searchTerms), 
             dimnames = searchTerms)

  for (j in seq(1,length(searchTerms)))
  {
    for (i in seq(1,length(searchVector)))
    { 
      output[i,j]=is.numeric(pmatch(searchTerms[j], searchVector[i]))
    }
  }
  return(as.numeric(output))
}

But I only get a matrix of 1s. I am very new to R, and I have looked around online with no luck. Any help would be greatly appreciated, thanks!

2 answers:

Answer 0: (score: 2)

The key is to use the function grepl. This should get you started:

searchVector <- c("aaa1","aaa2","","bbb1,aaa1,ccc1", "ddd1,ccc1,aaa1")

res <- lapply(c('aaa','bbb'),
              function(pattern,x) as.numeric(grepl(pattern = pattern,x = x)),
              x = searchVector)
do.call(cbind,res)

To explore a bit, start with grepl:

> grepl('aaa',searchVector)
[1]  TRUE  TRUE FALSE  TRUE  TRUE
> as.numeric(grepl('aaa',searchVector))
[1] 1 1 0 1 1

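Then I simply wrapped that in lapply to loop over the vector c('aaa','bbb'). That returns a list of vectors, which we then combine into the matrix you described using do.call and cbind.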
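As a rough sketch of how this could be packaged into the function the question asks for: the name findTrigger comes from the question, but the body, the use of vapply, and the fixed = TRUE argument (which treats each term as a literal substring rather than a regular expression) are assumptions on my part, not something from the original answer. It also seems worth noting why the original attempt gave only 1s: pmatch returns an integer NA when nothing matches, and is.numeric() is TRUE for that NA as well.

```r
# Hypothetical helper; the name is taken from the call shown in the question.
findTrigger <- function(searchTerms, searchVector) {
  # Drop the matrix dimensions added by cbind(), keeping a plain character vector.
  x <- as.vector(searchVector)
  # One grepl() call per term; fixed = TRUE matches the term literally
  # (an assumption: drop it if the terms are meant to be regular expressions).
  hits <- vapply(searchTerms,
                 function(term) grepl(term, x, fixed = TRUE),
                 logical(length(x)))
  # Convert TRUE/FALSE to 1/0 and label the columns with the search terms.
  matrix(as.integer(hits), nrow = length(x),
         dimnames = list(NULL, searchTerms))
}

searchVector <- cbind(c("aaa1", "aaa2", "", "bbb1,aaa1,ccc1", "ddd1,ccc1,aaa1"))
res <- findTrigger(c("aaa", "bbb"), searchVector)
print(res)

# Why the original loop produced only 1s: a failed pmatch() is NA_integer_,
# which is still numeric, so is.numeric() cannot tell a hit from a miss.
is.numeric(pmatch("bbb", "aaa1"))
```

The matrix() call at the end is there so that the result keeps its two-dimensional shape and column labels regardless of how vapply simplifies its output.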

Answer 1: (score: 1)

mapply and grepl (rather than grep; thanks, joran) are your friends:

searchTerms <- c("aaa", "bbb")
searchVector <- cbind(c("aaa1","aaa2","","bbb1,aaa1,ccc1", "ddd1,ccc1,aaa1"))
M <- mapply(grepl, searchTerms, MoreArgs=list(x=searchVector)) 
M
       aaa   bbb
[1,]  TRUE FALSE
[2,]  TRUE FALSE    
[3,] FALSE FALSE
[4,]  TRUE  TRUE
[5,]  TRUE FALSE

If you want 1s and 0s instead, use apply(M, 2, as.numeric).
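For what it is worth, here are two other ways to get 1s and 0s that skip the apply loop. Neither is part of the original answer; both are just a sketch. They rely on the fact that a logical matrix can be coerced to integer without losing its dimensions or column names. The names N1 and N2 are made up for illustration.

```r
searchTerms <- c("aaa", "bbb")
searchVector <- c("aaa1", "aaa2", "", "bbb1,aaa1,ccc1", "ddd1,ccc1,aaa1")
M <- mapply(grepl, searchTerms, MoreArgs = list(x = searchVector))

# Unary plus coerces logical to integer and keeps dim and dimnames.
N1 <- +M

# Changing the storage mode of a copy has the same effect.
N2 <- M
storage.mode(N2) <- "integer"

print(N1)
```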