I have the following HTML markup:
<ul>
<li>
<strong>Online:</strong>
2/14/2010 3:40 AM
</li>
<li>
<strong>Hearing Impaired:</strong>
No
</li>
<li>
<strong>Downloads:</strong>
3,840
</li>
</ul>
I want to grab the 3,840 from the last li, the one containing "Downloads:".
What would you suggest?
My attempt:
preg_match('/<li><strong>Downloads:<\/strong>(.*?)<\/li>/s', $s, $a);
Answer 0 (score: 3)
I'd suggest using an HTML parser here, DOMDocument in particular, together with XPath.
Example:
$markup = '<ul>
<li>
<strong>Online:</strong>
2/14/2010 3:40 AM
</li>
<li>
<strong>Hearing Impaired:</strong>
No
</li>
<li>
<strong>Downloads:</strong>
3,840
</li>
</ul>';
$dom = new DOMDocument();
$dom->loadHTML($markup);
$xpath = new DOMXPath($dom);
// Find the <strong> whose text is exactly "Downloads:" and read the text node that follows it as a string
$download = trim($xpath->evaluate("string(//strong[text()='Downloads:']/following-sibling::text())"));
echo $download; // 3,840
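The lookup above can be wrapped in a small reusable function. This is only a sketch: `getLabeledValue` is a hypothetical helper name, not part of DOMDocument, and it assumes the label contains no double quotes (it is interpolated into the XPath expression as-is). It uses `normalize-space()` so that stray whitespace inside the `<strong>` tag does not break the comparison, and `libxml_use_internal_errors()` to keep parser warnings quiet on sloppy markup.

```php
<?php
// Hypothetical helper: returns the text that follows <strong>$label</strong>
// inside an <li>, or null when no such label exists.
function getLabeledValue(string $html, string $label): ?string
{
    $dom = new DOMDocument();

    // Collect libxml warnings internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);

    // Assumption: $label contains no double quotes.
    $query = sprintf(
        '//li/strong[normalize-space(.) = "%s"]/following-sibling::text()[1]',
        $label
    );
    $nodes = $xpath->query($query);

    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    return trim($nodes->item(0)->nodeValue);
}

$markup = '<ul>
<li>
<strong>Online:</strong>
2/14/2010 3:40 AM
</li>
<li>
<strong>Hearing Impaired:</strong>
No
</li>
<li>
<strong>Downloads:</strong>
3,840
</li>
</ul>';

$downloads = getLabeledValue($markup, 'Downloads:');
$impaired  = getLabeledValue($markup, 'Hearing Impaired:');
$missing   = getLabeledValue($markup, 'Uploads:');
```

With the markup from the question, `$downloads` should hold `3,840`, `$impaired` should hold `No`, and `$missing` should be `null` because no such label exists.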
Answer 1 (score: 1)
Use an HTML parser to parse HTML. If you insist on a regular expression, you could try the following:
<li>[^<>]*<strong>Downloads:<\/strong>\s*\K.*?(?=\s*<\/li>)
Code:
$string = <<<EOT
<ul>
<li>
<strong>Online:</strong>
2/14/2010 3:40 AM
</li>
<li>
<strong>Hearing Impaired:</strong>
No
</li>
<li>
<strong>Downloads:</strong>
3,840
</li>
</ul>
EOT;
$regex = "~<li>[^<>]*<strong>Downloads:<\/strong>\s*\K.*?(?=\s*<\/li>)~s";
if (preg_match($regex, $string, $m)) {
    // \K discards everything matched so far, so $m[0] holds only the value
    $yourmatch = $m[0];
    echo $yourmatch; // 3,840
}
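For comparison, here is a sketch of why the pattern in the question finds nothing, and the smallest change that makes it match. The original pattern expects `<strong>` to follow `<li>` immediately, but the markup has a newline between them. Allowing optional whitespace with `\s*` is enough for this input. The variable names `$original`, `$fixed`, `$a` and `$b` are just for illustration, and the markup is shortened to the one relevant `<li>`.

```php
<?php
$s = "<ul>\n<li>\n<strong>Downloads:</strong>\n3,840\n</li>\n</ul>";

// Pattern from the question: fails because of the newline between <li> and <strong>.
$original = preg_match('/<li><strong>Downloads:<\/strong>(.*?)<\/li>/s', $s, $a);

// Same idea, but tolerant of whitespace around the tags and the value.
$fixed = preg_match('/<li>\s*<strong>Downloads:<\/strong>\s*(.*?)\s*<\/li>/s', $s, $b);

$value = $fixed === 1 ? $b[1] : null;
```

Here `$original` is `0` (no match), while `$fixed` is `1` and `$value` is `3,840`. This still breaks as soon as the markup gains attributes or nested tags, which is why both answers recommend a parser.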