Question

I have many utterances, each containing the word "well" at various positions. Here is some illustrative data:

data <- c("well what the church meeting 's got to decide",
        "oh well yes those are those are normal things",
        "well they 've sent you a letter from hospital",
        "and i think well you cheeky sod you know",
        "'cos she 's well that day albert took me",
        "yeah well you 're going out anyway so you")

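I want to extract the utterances that satisfy a negative positional criterion: "well" is not the first or second word in the utterance. The expected result is: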

data <- c("and i think well you cheeky sod you know",
        "'cos she 's well that day albert took me")

This pattern matches exactly the utterances I do not want to extract:

grep("^well|^\\w*\\swell", data, perl = T, value = T)
[1] "well what the church meeting 's got to decide" "oh well yes those are those are normal things"
[3] "well they 've sent you a letter from hospital" "yeah well you 're going out anyway so you"

The trick now is to negate this pattern. I have tried a negative lookahead, but it does not work:

grep("(?!^well|^\\w*\\swell)", data, perl = T, value = T)
[1] "well what the church meeting 's got to decide" "oh well yes those are those are normal things"
[3] "well they 've sent you a letter from hospital" "and i think well you cheeky sod you know"     
[5] "'cos she 's well that day albert took me"      "yeah well you 're going out anyway so you"

Which regex in R will perform the desired extraction? Thanks in advance.

Answer 1

You can use invert=TRUE to invert the result of grep, and you can simplify your pattern a little:

> data <- c("well what the church meeting 's got to decide",
+         "oh well yes those are those are normal things",
+         "well they 've sent you a letter from hospital",
+         "and i think well you cheeky sod you know",
+         "'cos she 's well that day albert took me",
+         "yeah well you 're going out anyway so you")
> grep("^\\s*(?:\\w+\\s+)?well\\b", data, value=TRUE, invert=TRUE)
[1] "and i think well you cheeky sod you know"
[2] "'cos she 's well that day albert took me"

There is no need to use the PCRE engine to run this pattern, so perl=TRUE can be dropped.
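If a lookahead-based pattern is preferred, the attempt in the question can be repaired by moving the anchor outside the lookahead. Unanchored, the negative lookahead fails at position 0 for an utterance that starts with "well", but the engine then retries at position 1, where ^ can no longer match, so the lookahead succeeds there and every string is returned. The following is a minimal sketch rather than part of the approach above; the names anchored and res are illustrative, and perl = TRUE is needed because lookaheads require the PCRE engine in R.

```r
data <- c("well what the church meeting 's got to decide",
          "oh well yes those are those are normal things",
          "well they 've sent you a letter from hospital",
          "and i think well you cheeky sod you know",
          "'cos she 's well that day albert took me",
          "yeah well you 're going out anyway so you")

# ^ sits outside the lookahead, so the lookahead is evaluated only at the
# start of the string and cannot succeed at some later position.
anchored <- "^(?!\\s*(?:\\w+\\s+)?well\\b)"

res <- grep(anchored, data, perl = TRUE, value = TRUE)
print(res)
```

Like invert=TRUE, this also returns utterances that contain no "well" at all. Appending .*\\bwell\\b to the pattern would additionally require the word to be present.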

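Regex details

^ - start of the string
\\s* - zero or more whitespace characters
(?:\\w+\\s+)? - an optional non-capturing group that matches:
- \\w+ - one or more word characters
- \\s+ - one or more whitespace characters
well\\b - the whole word well (\\b is a word boundary).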
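
As a cross-check on the breakdown above, the same criterion can be expressed without a positional regex by splitting each utterance into whitespace-separated tokens and testing the first two. This is a sketch under the assumption that a "word" means a whitespace-delimited token; the names words, well_early and kept are illustrative.

```r
data <- c("well what the church meeting 's got to decide",
          "oh well yes those are those are normal things",
          "well they 've sent you a letter from hospital",
          "and i think well you cheeky sod you know",
          "'cos she 's well that day albert took me",
          "yeah well you 're going out anyway so you")

# Split each utterance on runs of whitespace (leading/trailing space trimmed).
words <- strsplit(trimws(data), "\\s+")

# TRUE when "well" is the first or second token of the utterance.
well_early <- vapply(words, function(w) any(head(w, 2) == "well"), logical(1))

kept <- data[!well_early]
print(kept)
```

One difference worth knowing: \\w does not match an apostrophe, so the regex does not skip a first token such as "'cos", whereas the token-based check counts it as a word. On the sample data both approaches select the same two utterances.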
