Regex in R to extract utterances that meet a negated positional condition

Time: 2019-01-30 18:39:22

Tags: r regex

I have a lot of utterances, each containing the word "well" at various positions. Here is some illustrative data:

data <- c("well what the church meeting 's got to decide",
        "oh well yes those are those are normal things",
        "well they 've sent you a letter from hospital",
        "and i think well you cheeky sod you know",
        "'cos she 's well that day albert took me",
        "yeah well you 're going out anyway so you")

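I want to extract the utterances that meet a negated positional criterion: "well" is neither the first nor the second word in the utterance. The expected result is this: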

data <- c("and i think well you cheeky sod you know",
        "'cos she 's well that day albert took me")

This pattern gets me what I do not want to extract:

grep("^well|^\\w*\\swell", data, perl = T, value = T)
[1] "well what the church meeting 's got to decide" "oh well yes those are those are normal things"
[3] "well they 've sent you a letter from hospital" "yeah well you 're going out anyway so you"    

Now the trick is to negate this pattern. I have tried a negative lookahead, but it does not work; it returns every utterance:

grep("(?!^well|^\\w*\\swell)", data, perl = T, value = T)
[1] "well what the church meeting 's got to decide" "oh well yes those are those are normal things"
[3] "well they 've sent you a letter from hospital" "and i think well you cheeky sod you know"     
[5] "'cos she 's well that day albert took me"      "yeah well you 're going out anyway so you"

Which regex in R will perform the desired extraction? Thanks in advance.

1 answer:

Answer 0: (score: 1)

You can use invert=TRUE to invert the result of grep, and you can simplify your pattern a little:

> data <- c("well what the church meeting 's got to decide",
+         "oh well yes those are those are normal things",
+         "well they 've sent you a letter from hospital",
+         "and i think well you cheeky sod you know",
+         "'cos she 's well that day albert took me",
+         "yeah well you 're going out anyway so you")
> grep("^\\s*(?:\\w+\\s+)?well\\b", data, value=TRUE, invert=TRUE)
[1] "and i think well you cheeky sod you know"
[2] "'cos she 's well that day albert took me"

This pattern does not need the PCRE engine (perl=TRUE); it runs with R's default regex engine.
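As a quick sanity check of that claim, the sketch below runs the same pattern with and without perl=TRUE and compares the results. The variable names (pattern, default_engine, pcre_engine, via_grepl) are illustrative and not part of the original answer; the last line shows an equivalent formulation with grepl, in case a logical index is more convenient than invert=TRUE.

```r
data <- c("well what the church meeting 's got to decide",
          "oh well yes those are those are normal things",
          "well they 've sent you a letter from hospital",
          "and i think well you cheeky sod you know",
          "'cos she 's well that day albert took me",
          "yeah well you 're going out anyway so you")

# Matches utterances where "well" is the first or second word
pattern <- "^\\s*(?:\\w+\\s+)?well\\b"

# Same pattern, default engine vs. PCRE; invert=TRUE keeps the non-matches
default_engine <- grep(pattern, data, value = TRUE, invert = TRUE)
pcre_engine    <- grep(pattern, data, value = TRUE, invert = TRUE, perl = TRUE)
identical(default_engine, pcre_engine)

# Equivalent without invert: negate the logical vector returned by grepl
via_grepl <- data[!grepl(pattern, data)]
```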

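Regex details

  • ^ - start of the string
  • \\s* - 0 or more whitespace characters
  • (?:\\w+\\s+)? - an optional non-capturing group matching:
    • \\w+ - 1 or more word characters
    • \\s+ - 1 or more whitespace characters
  • well\\b - the whole word well (\\b is a word boundary).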
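
On why the lookahead in the question returned everything: an unanchored (?!...) is tried at every position in the string. Even when it fails at position 0, it succeeds at a later position, because ^ cannot match there, so every utterance produces an (empty) match. If a lookahead is preferred over invert=TRUE, one possible sketch is to anchor it at the start of the string. This variant does require perl=TRUE, and the variable names are illustrative only.

```r
data <- c("well what the church meeting 's got to decide",
          "oh well yes those are those are normal things",
          "well they 've sent you a letter from hospital",
          "and i think well you cheeky sod you know",
          "'cos she 's well that day albert took me",
          "yeah well you 're going out anyway so you")

# Unanchored lookahead from the question: succeeds somewhere in every string
unanchored <- grep("(?!^well|^\\w*\\swell)", data, perl = TRUE, value = TRUE)

# Anchored lookahead: the check is performed only at the start of the string
anchored <- grep("^(?!\\s*(?:\\w+\\s+)?well\\b)", data, perl = TRUE, value = TRUE)
```

Note that \\w+ does not match a token starting with an apostrophe such as 'cos. If such a token could be the first word directly before "well", replacing \\w+ with \\S+ may be a safer choice; that is an assumption about the data, not something the sample above requires.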