Question

I want to retrieve the first number from a string (here: 344002):

string <- '<a href="/Archiv-Suche/!344002&amp;s=&amp;SuchRahmen=Print/" ratiourl-ressource="344002"'

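Ideally I'd like a regular expression that looks for the digits after the ! and before the &amp;.

This is what I came up with, but it also captures the ! (giving !344002):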

regmatches(string, gregexpr("\\!([[:digit:]]+)", string, perl =TRUE))

Any ideas?

Answer 1

Use this regex:

(?<=\!)\d+(?=&amp)

Use it with this code (inside an R string literal the backslashes must be doubled):

regmatches(string, gregexpr("(?<=\\!)\\d+(?=&amp)", string, perl=TRUE))

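(?<=\!) is a lookbehind: the match starts right after the !, without including it.
\d+ matches one or more digits.
(?=&amp) is a lookahead: the match stops just before &amp, without including it.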
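
Since only the first number is wanted, a minimal sketch using regexpr() instead of gregexpr() may be enough. It assumes the string from the question; the variable names m and first_number are just illustrative.

```r
string <- '<a href="/Archiv-Suche/!344002&amp;s=&amp;SuchRahmen=Print/" ratiourl-ressource="344002"'

# regexpr() reports only the first match; perl = TRUE is needed for lookarounds.
m <- regexpr("(?<=!)\\d+(?=&amp)", string, perl = TRUE)

# regmatches() extracts the matched substring, without the ! or the &amp.
first_number <- regmatches(string, m)
print(first_number)
# [1] "344002"
```

If the pattern does not match, regmatches() returns character(0), so it is worth checking the length of the result before using it.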

Answer 2

library(gsubfn)
strapplyc(string, "!(\\d+)")[[1]]

[Old answer]

Try this code:

library(stringr)
str_extract(string, "[0-9]+")

A similar question and its answers can be found here:

Extract a regular expression match in R version 2.10

Answer 3

You can capture the digits (\d+) between ! and &amp; and retrieve them with regexec / regmatches:

> string <- '<a href="/Archiv-Suche/!344002&amp;s=&amp;SuchRahmen=Print/" ratiourl-ressource="344002"'
> pattern = "!(\\d+)&amp;"
> res <- unlist(regmatches(string,regexec(pattern,string)))
> res[2]
[1] "344002"

See the online R demo.
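
The same capture group can also be used with sub() and a backreference, as a rough alternative sketch. It assumes the string from the question; digits is an illustrative variable name.

```r
string <- '<a href="/Archiv-Suche/!344002&amp;s=&amp;SuchRahmen=Print/" ratiourl-ressource="344002"'

# Replace the whole string with the contents of the first capture group.
# The lazy .*? (perl = TRUE) stops at the first ! that is followed by digits and &amp;.
digits <- sub(".*?!(\\d+)&amp;.*", "\\1", string, perl = TRUE)
print(digits)
# [1] "344002"
```

Note that sub() returns the input unchanged when the pattern does not match, so this variant only makes sense when a match is known to exist.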

R: how to extract specific digits from a string?

3 answers: