How do I search for multiple words in a single regular expression?

Asked: 2018-02-24 20:14:34

Tags: r regex lapply gsub

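I have a list of specific words that I want to remove from a list of sentences. Do I have to loop over the list and apply the function once per regular expression, or can I somehow apply them all in a single call? I tried doing this with lapply, but I am hoping there is a better way.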

 string <- 'This is a sample sentence from which to gather some cool 
 knowledge'

 words <- c('a','from','some')

lapply(words,function(x){
  string <- gsub(paste0('\\b',words,'\\b'),'',string)
})

The output I want is: This is sample sentence which to gather cool knowledge.

2 Answers:

Answer 0 (score: 3)

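You can collapse the character vector of words to remove into a single pattern using the regex alternation operator ("|"), sometimes called the "pipe" symbol.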

gsub(paste0('\\b',words,'\\b', collapse="|"), '', string)
[1] "This is  sample sentence  which to gather  cool \n knowledge"

Alternatively, to also consume one whitespace character after each removed word:

gsub(paste0('\\b',words,'\\b\\s{0,1}', collapse="|"), '', string)
[1] "This is sample sentence which to gather cool \n knowledge"
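
The two calls above could be folded into a small helper. This is only a sketch: the function name `remove_words` is hypothetical, and it assumes the words contain no regex metacharacters and that collapsing every run of whitespace (including the line break in `string`) into a single space is acceptable.

```r
# Hypothetical helper (not part of the answer): remove whole words, then tidy whitespace.
remove_words <- function(text, words) {
  # Build one pattern such as "\\b(a|from|some)\\b"
  pattern <- paste0("\\b(", paste(words, collapse = "|"), ")\\b")
  stripped <- gsub(pattern, "", text)
  # Collapse the runs of whitespace left behind and trim both ends
  trimws(gsub("\\s+", " ", stripped))
}

string <- "This is a sample sentence from which to gather some cool \n knowledge"
words <- c("a", "from", "some")

remove_words(string, words)
# [1] "This is sample sentence which to gather cool knowledge"
```

Because `gsub` is vectorised over its input, the same helper should also accept a character vector of sentences without any explicit loop.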

Answer 1 (score: 0)

You need to use "|" to express alternation (OR) in the regular expression:

string2 <- gsub(paste(words,'|',collapse =""),'',string)

> string2
[1] "This is sample sentence which to gather cool knowledge"
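
One caveat with a pattern built without `\\b` anchors is that it can also match inside longer words. A minimal sketch, using a made-up input string and a simplified variant of the pattern above (both are assumptions for illustration, not part of the answer):

```r
words <- c("a", "from", "some")
text <- "banana split"   # made-up input for illustration

# No word boundaries: the "a " at the end of "banana" is matched and removed
no_bounds <- paste0(words, " ", collapse = "|")   # "a |from |some "
gsub(no_bounds, "", text)
# [1] "banansplit"

# With word boundaries, nothing inside "banana" qualifies as a whole word
with_bounds <- paste0("\\b", words, "\\b\\s{0,1}", collapse = "|")
gsub(with_bounds, "", text)
# [1] "banana split"
```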