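How can I get the text between two HTML tags when the tags are not always identical? How can I make a regular expression "ignore" one part?
Let's say this is my HTML: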
<html>
...
<span id="ctl00_ContentPlaceHolder1_gvDomain_ctl03_lblName">string 1</span>
...
<span id="ctl00_ContentPlaceHolder1_gvDomain_ctl04_lblName">string 2</span>
...
<span id="ctl00_ContentPlaceHolder1_gvDomain_ctl53_lblName">string 3</span>
...
</html>
As you can see, the ctlxx part is not always the same, and this code only gets the first string:
preg_match('#\\<span id="ctl00_ContentPlaceHolder1_gvDomain_ctl03_lblName">(.+)\\</span>#s',$html,$matches);
$match = $matches[0];
echo $match;
How can I make the regex ignore the ctlxx part and echo all of the strings?
Thanks in advance.
Answer 0 (score: 0)
You can do this by using preg_match through DOMDocument and DOMXPath:
$dom = new DOMDocument();
$dom->loadHTML($str);
$x = new DOMXpath($dom);
// The next two lines make PHP functions callable from within the XPath expression
$x->registerNamespace("php", "http://php.net/xpath");
$x->registerPHPFunctions();
// Select span tags with proper id
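// Test the id attribute (not the node text) and compare with 0 so a non-match is not treated as true
foreach ($x->query('//span[php:functionString("preg_match", "/^ctl00_ContentPlaceHolder1_gvDomain_ctl\d+_lblName$/", @id) > 0]') as $node) {
    echo $node->nodeValue, PHP_EOL;
}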
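The same selection can probably be done without registering PHP functions, using only XPath 1.0 string functions. The following is a minimal sketch under a few assumptions: the ids always start with `ctl00_ContentPlaceHolder1_gvDomain_ctl` and end with `_lblName`, the function name `extractLabelTexts` is made up for illustration, and the extra `other` span is sample data added to show that non-matching spans are skipped.

```php
<?php
// Hypothetical helper: returns the text of every span whose id has the expected prefix and suffix.
function extractLabelTexts(string $html): array
{
    // Collect libxml warnings internally instead of printing them for imperfect HTML.
    libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath  = new DOMXPath($dom);
    $prefix = 'ctl00_ContentPlaceHolder1_gvDomain_ctl';
    $suffix = '_lblName';

    // XPath 1.0 has no ends-with(), so the suffix test is emulated with substring().
    $query = sprintf(
        '//span[starts-with(@id, "%s") and substring(@id, string-length(@id) - %d) = "%s"]',
        $prefix,
        strlen($suffix) - 1,
        $suffix
    );

    $texts = [];
    foreach ($xpath->query($query) as $node) {
        $texts[] = $node->nodeValue;
    }
    return $texts;
}

// Sample markup modeled on the question.
$sample = '<html><body>'
    . '<span id="ctl00_ContentPlaceHolder1_gvDomain_ctl03_lblName">string 1</span>'
    . '<span id="other">not a label</span>'
    . '<span id="ctl00_ContentPlaceHolder1_gvDomain_ctl53_lblName">string 3</span>'
    . '</body></html>';

foreach (extractLabelTexts($sample) as $text) {
    echo $text, PHP_EOL;
}
```

Unlike the regex-in-XPath version, this does not check that the middle part is numeric; it only checks the fixed prefix and suffix, which may be enough when the ids are generated by the same grid control.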
Answer 1 (score: 0)
If you want to solve it with a regular expression, you can do something like this:
<?php
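// preg_match_all collects every span; the lazy (.+?) stops at the first closing tag
preg_match_all('/<span id="[^"]*">(.+?)<\/span>/is', $html, $matches);
// $matches[1] holds the captured text of each span
foreach ($matches[1] as $match) {
    echo $match, PHP_EOL;
}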
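To see why the quantifier matters here, the sketch below runs a greedy and a lazy capture against the same input. It assumes the id format from the question, with `\d+` standing in for the changing ctlxx number; the variable names and the two-span sample string are illustrative only.

```php
<?php
// Two labels on separate lines, modeled on the question's HTML.
$html = '<span id="ctl00_ContentPlaceHolder1_gvDomain_ctl03_lblName">string 1</span>' . "\n"
      . '<span id="ctl00_ContentPlaceHolder1_gvDomain_ctl04_lblName">string 2</span>';

// \d+ accepts any run of digits in place of the varying control number.
$idPattern = 'ctl00_ContentPlaceHolder1_gvDomain_ctl\d+_lblName';

// Greedy: with the s flag, (.+) runs on to the LAST </span>,
// so a single match swallows both labels and the markup between them.
preg_match_all('#<span id="' . $idPattern . '">(.+)</span>#s', $html, $greedy);

// Lazy: (.+?) stops at the FIRST </span> after each opening tag,
// giving one capture per label.
preg_match_all('#<span id="' . $idPattern . '">(.+?)</span>#s', $html, $lazy);

echo 'greedy captures: ', count($greedy[1]), PHP_EOL;
echo 'lazy captures: ', count($lazy[1]), PHP_EOL;
foreach ($lazy[1] as $text) {
    echo $text, PHP_EOL;
}
```

Using `#` as the delimiter avoids having to escape the slash in `</span>`. For markup that is nested or less regular than this, a DOM parser is likely the safer choice.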