Question

I want to extract all the numbers that come after the string "mystr" and any text following it. For example, given this string:

x <- "This is mystring hola 8 and this yourstring hola 9 and again mystrings op 12."

It should return 8 and 12. In R, I tried:

stringr::str_extract_all(x, "mystr.*\\d+")

Answer 1

You can extract the closest chunk of digits after mystr with:

x <- "This is mystring hola 8 and this yourstring hola 9 and again mystrings op 12."
regmatches(x, gregexpr("mystr.*?\\K\\d+", x, perl=TRUE))
# => [[1]]
#    [1] "8"  "12"

See the R demo.

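This PCRE regex matches:

mystr - the literal text mystr
.*? - any 0+ characters other than line break characters, as few as possible
\\K - omits the text matched so far from the returned match
\\d+ - 1 or more digits.

See the PCRE regex demo.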
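The difference between the greedy attempt from the question and the lazy pattern can be checked directly. The following is a minimal sketch in base R; the variable names `greedy` and `lazy` are illustrative and not part of the original answer.

```r
x <- "This is mystring hola 8 and this yourstring hola 9 and again mystrings op 12."

# Greedy .* grabs as much as possible, so everything from the first
# "mystr" up to the last digit comes back as one long match.
greedy <- regmatches(x, gregexpr("mystr.*\\d+", x, perl = TRUE))[[1]]

# Lazy .*? stops at the first digit run after each "mystr",
# and \K drops the text before the digits from the result.
lazy <- regmatches(x, gregexpr("mystr.*?\\K\\d+", x, perl = TRUE))[[1]]

print(length(greedy))  # a single match
print(lazy)            # "8" "12"
```

Note that the lazy pattern does not stop at word boundaries: if an occurrence of mystr has no digits right after it, the match keeps going until the next digits, so "mystring and yourstring hola 9" would yield "9".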

If you want to use stringr, you can use str_match_all:

> library(stringr)
> x <- "This is mystring hola 8 and this yourstring hola 9 and again mystrings op 12."
> str_match_all(x, "mystr.*?(\\d+)")[[1]][,2]
[1] "8"  "12"

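The digits are captured into Group 1, which is the second column of the matrix that str_match_all returns (the first column holds the full match).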
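To see where the `[, 2]` index comes from, it can help to look at the whole matrix. This is a small sketch that assumes the stringr package is installed; the variable name `m` is only for illustration.

```r
library(stringr)

x <- "This is mystring hola 8 and this yourstring hola 9 and again mystrings op 12."

# str_match_all returns a list with one character matrix per input string.
m <- str_match_all(x, "mystr.*?(\\d+)")[[1]]

full_matches <- m[, 1]  # column 1: the whole match, from "mystr" through the digits
digits       <- m[, 2]  # column 2: capture group 1, the digits only

print(full_matches)        # "mystring hola 8" "mystrings op 12"
print(as.integer(digits))  # 8 12
```

The captured values are character strings, so a conversion such as `as.integer` is needed if numbers are wanted.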

Answer 2

Sometimes str_match is more flexible than str_extract:

library(stringr)
str_match_all("This is mystring hola 8 and this yourstring hola 9 and again mystrings op 12.", 
              "mystring.*?(\\d+)")[[1]][, 2]

[1] "8"  "12"
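Both answers work on a single string. If the same extraction is needed for several strings, the base R approach can be wrapped in a small function. The helper below, `numbers_after`, is a hypothetical name introduced here for illustration, and it assumes `prefix` is already a valid regex fragment (no escaping is done).

```r
# Hypothetical helper (not from the answers): for each input string, return
# the first run of digits that follows every occurrence of `prefix`.
numbers_after <- function(strings, prefix = "mystr") {
  pattern <- paste0(prefix, ".*?\\K\\d+")
  hits <- regmatches(strings, gregexpr(pattern, strings, perl = TRUE))
  lapply(hits, as.integer)  # strings without a match give integer(0)
}

inputs <- c(
  "mystring hola 8 and mystrings op 12.",
  "yourstring hola 9",
  "mystr 3"
)
result <- numbers_after(inputs)
print(result)
```

Returning a list keeps one element per input, so strings with no match stay visible as empty vectors instead of silently disappearing.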

All first numbers after a certain string pattern using regular expressions in R
