I'm new to regular expressions and have tried many times to get the image tag values inside anchor tags in HTML. Here is my HTML:
<div class="smallSku" id="ctl00_ContentPlaceHolder1_smallImages">
<a title="" name="http://www.playg.in/productImages/med/PNC000051_PNC000051.jpg" href="http://www.playg.in/productImages/lrg/PNC000051_PNC000051.jpg" onclick="return showPic(this)" onmouseover="return showPic(this)">
<img border="0" alt="" src="http://www.playg.in/productImages/thmb/PNC000051_PNC000051.jpg"></a> <a title="PNC000051_PNC000051_1.jpg" name="http://www.playg.in/productImages/med/PNC000051_PNC000051_1.jpg" href="http://www.playg.in/productImages/lrg/PNC000051_PNC000051_1.jpg" onclick="return showPic(this)" onmouseover="return showPic(this)">
<img border="0" alt="PNC000051_PNC000051_1.jpg" src="http://www.playg.in/productImages/thmb/PNC000051_PNC000051_1.jpg"></a>
</div>
I want to return only the src values of the img tags. I tried the following pattern with preg_match_all():
"@<div[\s\S]class="smallSku"[\s\S]id="ctl00_ContentPlaceHolder1_smallImages"\><a title=\"\" name="[\w\W]" href="[\w\W]" onclick=\"[\w\W]" onmouseover="[\w\W]"\><img[\s\S]src="(.*)"[\s\S]></a><\/div>@"
Please help. I've spent a lot of time on this, and I also tried the approach from this question: Match image tag not nested in an anchor tag using regular expression
Answer 0 (score: 5)
Regular expressions are not the right tool for parsing HTML. See this FAQ: How to parse and process HTML/XML?
Here is an example of how to get the src attributes from your sample:
$doc = new DOMDocument();
$doc->loadHTML($your_html_string);
$xpath = new DOMXPath($doc);
foreach ($xpath->query('//div[@class="smallSku"]/a/img/@src') as $attr) {
    $src = $attr->value;
    print $src;
}
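For a version that can be run as-is, the same XPath query can be wrapped in a small function and fed a trimmed copy of the markup from the question. This is only a sketch: the function name extractImageSources and the shortened anchor attributes are assumptions made for illustration, and the libxml_use_internal_errors() calls are there only to keep parser warnings about imperfect markup out of the output.

```php
<?php
// Hypothetical helper: returns the src of every <img> that sits inside an <a>
// directly under the div whose class attribute is exactly "smallSku".
function extractImageSources(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $sources = [];
    foreach ($xpath->query('//div[@class="smallSku"]/a/img/@src') as $attr) {
        $sources[] = $attr->value;
    }
    return $sources;
}

// Trimmed copy of the question's markup (anchor attributes shortened).
$html = <<<'HTML'
<div class="smallSku" id="ctl00_ContentPlaceHolder1_smallImages">
<a title="" href="http://www.playg.in/productImages/lrg/PNC000051_PNC000051.jpg">
<img border="0" alt="" src="http://www.playg.in/productImages/thmb/PNC000051_PNC000051.jpg"></a>
<a title="PNC000051_PNC000051_1.jpg" href="http://www.playg.in/productImages/lrg/PNC000051_PNC000051_1.jpg">
<img border="0" alt="PNC000051_PNC000051_1.jpg" src="http://www.playg.in/productImages/thmb/PNC000051_PNC000051_1.jpg"></a>
</div>
HTML;

print_r(extractImageSources($html));
```

Returning an array rather than printing inside the loop makes it easier to reuse the result, for example to download each thumbnail.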
Answer 1 (score: 2)
Try this:
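$content = file_get_contents('your url');
// Grab the smallSku container; the "s" modifier lets "." span newlines.
preg_match_all('|<div class="smallSku"[^>]*>.*?</div>|s', $content, $arr, PREG_PATTERN_ORDER);
// Pull every double-quoted src value out of the first matched block.
preg_match_all('/src="([^"]+)"/', $arr[0][0], $arrr, PREG_PATTERN_ORDER);
echo '<pre>';
print_r($arrr);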
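To try the two-step regex idea without fetching a page, something like the following can be run against an inline copy of the question's markup. It is a sketch under stated assumptions: the function name extractSrcWithRegex is hypothetical, attributes are assumed to be double-quoted as in the question, and the container is assumed to hold no nested div (the lazy match stops at the first closing div tag).

```php
<?php
// Hypothetical helper: two-step regex extraction of src values.
// Assumes double-quoted attributes and no <div> nested inside the container.
function extractSrcWithRegex(string $content): array
{
    // Step 1: isolate the smallSku block. The "s" modifier lets "." match
    // newlines, and ".*?" stops at the first closing </div>.
    if (!preg_match('|<div class="smallSku"[^>]*>.*?</div>|s', $content, $block)) {
        return [];
    }
    // Step 2: collect every src="..." value inside that block.
    preg_match_all('/src="([^"]+)"/', $block[0], $matches);
    return $matches[1];
}

// Trimmed copy of the question's markup (anchor attributes shortened).
$html = <<<'HTML'
<div class="smallSku" id="ctl00_ContentPlaceHolder1_smallImages">
<a title="" href="http://www.playg.in/productImages/lrg/PNC000051_PNC000051.jpg">
<img border="0" alt="" src="http://www.playg.in/productImages/thmb/PNC000051_PNC000051.jpg"></a>
<a title="PNC000051_PNC000051_1.jpg" href="http://www.playg.in/productImages/lrg/PNC000051_PNC000051_1.jpg">
<img border="0" alt="PNC000051_PNC000051_1.jpg" src="http://www.playg.in/productImages/thmb/PNC000051_PNC000051_1.jpg"></a>
</div>
HTML;

print_r(extractSrcWithRegex($html));
```

This works only as long as the markup keeps this exact shape; a change in quoting or nesting would break it, which is why the DOM-based approach is usually the safer choice.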