PHP regex to get the image source from inside an anchor tag

Asked: 2011-10-10 06:36:56

Tags: php regex

I need help getting the image source from an image tag that is wrapped in an anchor tag, for example:

<p>this is sample text <a href="a link to some site"><img src="imagesource" height='x' width='y' /></a></p>

From the string above, I want to retrieve the image source.

Here is the actual description, from which I need to extract the source of only the first image tag:

$desc = "<p></p><p><a href='http://somesitename.com/wp-content/uploads/2007/08/100_2666.JPG' title='100_2666.JPG'><img width=\"400\" src='http://somesitename.com/wp-content/uploads/2007/08/100_2666.JPG' alt='100_2666.JPG' /></a></p><p><a href='http://somesitename.com/wp-content/uploads/2007/08/100_2667.JPG' title='100_2667.JPG'><img width=\"400\" src='http://somesitename.com/wp-content/uploads/2007/08/100_2667.JPG' alt='100_2667.JPG' /></a></p><p>These are some of the variations of cotton floral prints used by the Knickerbocker Toy Co. in the 1960's. I constantly search for more examples of the unusual early prints and sometimes have to purchase a doll in fair condition just to have her dress!! See the article about the Knickerbocker Anns that follows.</p>";

$imgsrc_regex = '/<a (.*)><img.+?src=(\'|")(.+?)(\'|")[^>]*><(.*\/)*a>/';
preg_match($imgsrc_regex, $desc, $arr_match_array);

The $arr_match_array above ends up covering all of the anchor/image tags, whereas I only want the first one.

2 Answers:

Answer 0 (score: 0)

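// Sample markup from the question; it must be well-formed XML for SimpleXML to parse it.
$x = <<<_A_
<p>this is sample text <a href="a link to some site"><img src="imagesource" height='x' width='y' /></a></p>
_A_;
$s = simplexml_load_string($x);
// Select every <img> element; the first entry is the first image in document order.
$images = $s->xpath('//img');
echo $images[0]['src'];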
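The snippet above relies on the input being well-formed XML with a single root element, which the real $desc from the question is not (it contains several sibling <p> elements). As a rough sketch of the same XPath idea using PHP's more forgiving HTML parser, something like the following might work. The function name first_linked_image_src is hypothetical, the $desc here is a shortened stand-in for the question's string, and the type hints assume PHP 7.1 or later.

```php
<?php
// Shortened stand-in for the $desc string in the question (two linked images).
$desc = "<p></p><p><a href='http://somesitename.com/wp-content/uploads/2007/08/100_2666.JPG' title='100_2666.JPG'>"
      . "<img width=\"400\" src='http://somesitename.com/wp-content/uploads/2007/08/100_2666.JPG' alt='100_2666.JPG' /></a></p>"
      . "<p><a href='http://somesitename.com/wp-content/uploads/2007/08/100_2667.JPG' title='100_2667.JPG'>"
      . "<img width=\"400\" src='http://somesitename.com/wp-content/uploads/2007/08/100_2667.JPG' alt='100_2667.JPG' /></a></p>";

// Hypothetical helper: returns the src of the first <img> nested inside an <a>, or null if none.
function first_linked_image_src(string $html): ?string
{
    if ($html === '') {
        return null; // loadHTML() rejects an empty string
    }

    $doc = new DOMDocument();
    // Collect parser warnings about sloppy markup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Results come back in document order, so item(0) is the first linked image.
    $nodes = $xpath->query('//a//img[@src]');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return $nodes->item(0)->getAttribute('src');
}

echo first_linked_image_src($desc), PHP_EOL;
```

Because the XPath query returns nodes in document order, taking item(0) gives only the first image, which is what the question asks for.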

Answer 1 (score: 0)

/<\s*img\s+[^>]*?src\s*=\s*(["'])(.*?)\1[^>]*>/i

That said, I would advise against using a regex here, because the HTML may not be 100% standards-compliant. Try an HTML parser instead, such as:

http://people.virginia.edu/~rtg2t/rda/ref.manual.html
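If a regex is still preferred, here is a minimal sketch of how the first image source could be pulled out with preg_match. The pattern is an assumed simplification, not the answer's exact expression, and the function name first_img_src_regex is hypothetical. The greedy (.*) groups in the question's pattern can stretch across several tags; the lazy quantifiers used here stop at the first closing quote, and preg_match itself returns only the leftmost match.

```php
<?php
// Shortened stand-in for the $desc string in the question (two linked images).
$desc = "<p></p><p><a href='http://somesitename.com/wp-content/uploads/2007/08/100_2666.JPG' title='100_2666.JPG'>"
      . "<img width=\"400\" src='http://somesitename.com/wp-content/uploads/2007/08/100_2666.JPG' alt='100_2666.JPG' /></a></p>"
      . "<p><a href='http://somesitename.com/wp-content/uploads/2007/08/100_2667.JPG' title='100_2667.JPG'>"
      . "<img width=\"400\" src='http://somesitename.com/wp-content/uploads/2007/08/100_2667.JPG' alt='100_2667.JPG' /></a></p>";

// Hypothetical helper: returns the src of the first <img> tag found, or null if none.
function first_img_src_regex(string $html): ?string
{
    // Assumed pattern: group 1 captures the opening quote, group 2 the URL,
    // and \1 requires the same quote character to close the attribute value.
    $pattern = '/<img\s[^>]*?src\s*=\s*(["\'])(.*?)\1/i';

    // preg_match() stops after the first (leftmost) match.
    if (preg_match($pattern, $html, $m) === 1) {
        return $m[2];
    }
    return null;
}

echo first_img_src_regex($desc), PHP_EOL;
```

This sketch only handles quoted src values and does not check that the image sits inside an anchor; markup that deviates from that shape is where a parser becomes the safer choice.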