Question

I have this HTML string, which contains a specific URL that I need to extract.

$string = 'Hi, this is a long string,
<br>
some more text, and then suddenly, a script tag!
<script type="text/javascript" src="http://www.example.com/static/123456/js/SiteCatalyst.js"></script>
<p>more text here</p>
<script type="text/javascript" src="http://www.example.com/other.js"></script>
and then, the end...';

The problem is that I need the value 123456, which sits at this position in the URL:

http://www.example.com/static/xxxxxx/js/SpecificScript.js

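Since the value can (and will) change, it needs to be parsed out of the string dynamically. My first approach was to find every URL in the string, but that could be too expensive if there are many URLs.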

TL;DR: How do I get the xxxxxx value from a URL that sits inside a larger HTML string?

Answer 1

http:\/\/www\.example\.com\/static\/(\d+)\/js

Try this and grab the first capture group. See the demo.

https://regex101.com/r/nL5yL3/1
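
As a minimal sketch of how this pattern might be used from PHP: preg_match() stores the whole match at index 0 and the first capture group at index 1. The variable names ($html, $m, $id) and the shortened sample string are assumptions for illustration, and `~` is used as the delimiter so the slashes need no escaping.

```php
<?php
// Shortened sample input (hypothetical; stands in for the larger HTML string).
$html = 'text <script type="text/javascript" src="http://www.example.com/static/123456/js/SiteCatalyst.js"></script> more text';

// Same pattern as above, with "~" as the delimiter so "/" needs no escaping.
$pattern = '~http://www\.example\.com/static/(\d+)/js~';

$id = null;
if (preg_match($pattern, $html, $m) === 1) {
    // $m[0] is the whole match, $m[1] is the first capture group.
    $id = $m[1];
}

echo $id ?? 'not found', PHP_EOL;
```

Checking the return value against 1 distinguishes a real match from both "no match" (0) and a regex error (false).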

Answer 2

Use \K to discard the characters matched so far, so that only the digits after it end up in the reported match.

src="http://www.example.com/static/\K\d+(?=\/)


$string = 'Hi, this is a long string,
<br>
some more text, and then suddenly, a script tag!
<script type="text/javascript" src="http://www.example.com/static/123456/js/SiteCatalyst.js"></script>
<p>more text here</p>
<script type="text/javascript" src="http://www.example.com/other.js"></script>
and then, the end...';
preg_match('~src="http://www.example.com/static/\K\d+(?=\/)~', $string, $matches);
echo $matches[0];

Output:

123456

Answer 3

Use lookarounds:

(?<=http:\/\/www\.example\.com\/static\/)\d+(?=\/js)

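
A minimal sketch of the lookaround pattern above in PHP, assuming a shortened sample string and hypothetical variable names ($html, $m, $id). Because the lookbehind and lookahead consume no characters, the digits are the whole match and land in index 0. The lookbehind works here because the prefix is a fixed-length literal, which PCRE requires.

```php
<?php
// Shortened sample input (hypothetical; stands in for the larger HTML string).
$html = 'text <script type="text/javascript" src="http://www.example.com/static/123456/js/SiteCatalyst.js"></script> more text';

// Lookbehind asserts the URL prefix, lookahead asserts the "/js" that follows.
// "~" is the delimiter, so "/" needs no escaping.
$pattern = '~(?<=http://www\.example\.com/static/)\d+(?=/js)~';

// Only the digits are consumed, so they are the full match in $m[0].
$id = preg_match($pattern, $html, $m) === 1 ? $m[0] : null;

var_dump($id);
```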

Answer 4

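You can use PHP's preg_match_all() to get the number. Note that this takes the first run of digits anywhere in the string, so it only works when no other number appears before the URL.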

$string = 'Hi, this is a long string,
<br>
some more text, and then suddenly, a script tag!
<script type="text/javascript" src="http://www.example.com/static/123456/js/SiteCatalyst.js"></script>
<p>more text here</p>
<script type="text/javascript" src="http://www.example.com/other.js"></script>
and then, the end...';

preg_match_all('!\d+!', $string, $matches);
echo $matches[0][0]; //output 123456
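
To illustrate how a bare `\d+` can pick up the wrong value, here is a small sketch with a hypothetical input in which another number precedes the script URL. The sample string and the variable names ($html, $all, $first, $anchored, $ids) are assumptions for illustration. Anchoring the pattern to the URL prefix is one way to avoid the problem.

```php
<?php
// Hypothetical input: the number 42 appears before the script URL.
$html = 'Order 42 shipped. <script src="http://www.example.com/static/123456/js/SiteCatalyst.js"></script>';

// Unanchored: collects every digit run in document order.
preg_match_all('!\d+!', $html, $all);
$first = $all[0][0]; // the first digit run, which is "42" here

// Anchored to the URL prefix: only digits inside the wanted path are captured.
preg_match_all('~http://www\.example\.com/static/(\d+)/js~', $html, $anchored);
$ids = $anchored[1]; // values of capture group 1 across all matches

var_dump($first, $ids);
```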

Get a query string value from a URL inside a larger string using PHP
