Avoid creating columns that do not match any string

Time: 2017-08-30 06:38:54

Tags: r grepl

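I want to create new columns based on string matching. I can create them, but columns are also created for strings that match nothing. For example: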

      x <- data.frame(name = c("Java Hackathon", "Intro to Graphs",
                               "Hands on Cypher"))
      toMatch <- c("Hackathon", "Hands on", "Test", "java")


      ## Sentence with phrases
      phrases11 <- as.vector(toMatch)
      res <- sapply(phrases11, grepl, x = as.character(x$name),
                    ignore.case = TRUE)
      rownames(res) <- x$name

      # Replace each TRUE with the name of the matching pattern
      ones <- which(res == 1, arr.ind = TRUE)
      res[ones] <- colnames(res)[ones[, 2]]
      res

      Output:
                         Hackathon   Hands on     Test     java   
     Java Hackathon     "Hackathon"   "FALSE"    "FALSE"  "java" 
     Intro to Graphs    "FALSE"       "FALSE"    "FALSE"  "FALSE"
     Hands on Cypher    "FALSE"     "Hands on"   "FALSE"  "FALSE"

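I don't want the "Test" column to be created, because I have a large number of strings to match. So basically, can we change something in `res <- sapply(phrases11, grepl, x = as.character(x$name), ignore.case = TRUE)` so that it only creates columns for the entries of the `toMatch` vector that actually match? Is there any other way to do this?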

1 answer:

Answer 0: (score: 0)

Since the grepl() function you are using returns TRUE or FALSE, you can drop the columns whose sum is 0:

  A <- sapply(toMatch, grepl, as.character(x$name), ignore.case = TRUE)
  # Keep every column with at least one match (column sum greater than 0)
  A[, colSums(A) > 0, drop = FALSE]
     Hackathon Hands on  java
[1,]      TRUE    FALSE  TRUE
[2,]     FALSE    FALSE FALSE
[3,]     FALSE     TRUE FALSE
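
To get the labelled matrix from the question without the empty "Test" column, the column filter can be applied before the replacement step. The following is a minimal sketch that reuses the question's data. The names `kept` and `hits` are introduced here for illustration only, and `stringsAsFactors = FALSE` is an assumption added so that `x$name` is a plain character vector on any R version.

```r
# Data from the question
x <- data.frame(name = c("Java Hackathon", "Intro to Graphs", "Hands on Cypher"),
                stringsAsFactors = FALSE)
toMatch <- c("Hackathon", "Hands on", "Test", "java")

# Logical matrix: one row per name, one column per pattern
res <- sapply(toMatch, grepl, x = x$name, ignore.case = TRUE)
rownames(res) <- x$name

# Keep only the patterns that matched at least one row;
# drop = FALSE keeps a matrix even if a single column survives
kept <- res[, colSums(res) > 0, drop = FALSE]

# Replace each TRUE with its pattern name, as in the question
# (the assignment coerces the matrix to character, so FALSE becomes "FALSE")
hits <- which(kept, arr.ind = TRUE)
kept[hits] <- colnames(kept)[hits[, "col"]]
kept
```

Filtering first means the replacement step only touches columns that will be kept, which may matter when `toMatch` is long.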