I have a string of data stored in $content. A sample of this data looks like this:
This is some sample data which is going to contain an image in the format <img src="http://www.randomdomain.com/randomfolder/randomimagename.jpg">. It will also contain lots of other text and maybe another image or two.
I am trying to grab <img src="http://www.randomdomain.com/randomfolder/randomimagename.jpg">
and save it as another string, e.g. $extracted_image.
So far I have this...
if( preg_match_all( '/<img[^>]+src\s*=\s*["\']?([^"\' ]+)[^>]*>/', $content, $extracted_image ) ) {
    $new_content .= 'NEW CONTENT IS '.$extracted_image.'';
}
All it returns is...
NEW CONTENT IS Array
I realise my attempt may be completely wrong, but can someone tell me where I am going wrong?
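Answer 0 (score: 1)
Your first problem is that preg_match_all() (http://php.net/manual/en/function.preg-match-all.php) puts an array into $matches, so you should output a single item from that array. Start with $extracted_image[0]; note that with preg_match_all() it is itself an array holding every full match, so the first matched tag is $extracted_image[0][0].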
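As a minimal sketch of the array layout described above, the snippet below runs the question's own pattern against a shortened copy of the sample text. The variable names $pattern, $first_tag and $first_src are illustrative and not from the original post.

```php
<?php
// Shortened copy of the sample data from the question.
$content = 'This is some sample data which is going to contain an image in the format '
    . '<img src="http://www.randomdomain.com/randomfolder/randomimagename.jpg">. '
    . 'It will also contain lots of other text.';

// The pattern from the question, unchanged.
$pattern = '/<img[^>]+src\s*=\s*["\']?([^"\' ]+)[^>]*>/';

$new_content = '';
if (preg_match_all($pattern, $content, $extracted_image)) {
    // Default ordering (PREG_PATTERN_ORDER):
    //   $extracted_image[0] lists every full <img ...> tag that matched,
    //   $extracted_image[1] lists what capture group 1 caught (the src values).
    $first_tag = $extracted_image[0][0];
    $first_src = $extracted_image[1][0];
    $new_content .= 'NEW CONTENT IS ' . $first_tag;
}

echo $new_content, PHP_EOL;
```

Concatenating an array into a string is what produced "NEW CONTENT IS Array"; indexing down to a single string element avoids that.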
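Answer 1 (score: 1)
If you only want one result, you need to use a different function:
preg_match()
stops after the first match and stores only that match.
preg_match_all()
stores an array containing all of the matches.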
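A short sketch of the preg_match() route, assuming the pattern from the question is kept as-is. The second image in the sample string and the variable name $match are added here purely for illustration.

```php
<?php
// Sample text with two images; only the first one should be picked up.
$content = 'Some text <img src="http://www.randomdomain.com/randomfolder/randomimagename.jpg"> '
    . 'more text <img src="http://www.example.com/randomfolder/randomimagename.jpg"> the end.';

// The pattern from the question, unchanged.
$pattern = '/<img[^>]+src\s*=\s*["\']?([^"\' ]+)[^>]*>/';

$extracted_image = '';
if (preg_match($pattern, $content, $match)) {
    // $match is a flat array: [0] is the whole matched tag, [1] is the src value.
    $extracted_image = $match[0];
}

echo 'NEW CONTENT IS ' . $extracted_image, PHP_EOL;
```

Because preg_match() fills a flat array rather than a nested one, $match[0] is already a string and can be concatenated directly.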
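Answer 2 (score: 0)
Parsing valid HTML with regular expressions is unwise. There may be unexpected attributes before the src attribute, non-img tags can trick the regex into false-positive matches, and attribute values can be wrapped in either single or double quotes, so you should use a DOM parser instead. It is clean, reliable, and easy to read.
Code: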
$string = <<<HTML
This is some sample data which is going to contain an image
in the format <img src="http://www.randomdomain.com/randomfolder/randomimagename.jpg">.
It will also contain lots of other text and maybe another image or two
like this: <img alt='another image' src='http://www.example.com/randomfolder/randomimagename.jpg'>
HTML;
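$srcs = [];
$dom = new DOMDocument;
// Parse the fragment; DOMDocument wraps it in html/body elements as needed.
$dom->loadHTML($string);
// Collect the src attribute of every <img> element, whatever its quoting or attribute order.
foreach ($dom->getElementsByTagName('img') as $img) {
    $srcs[] = $img->getAttribute('src');
}
var_export($srcs);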
Output:
array (
  0 => 'http://www.randomdomain.com/randomfolder/randomimagename.jpg',
  1 => 'http://www.example.com/randomfolder/randomimagename.jpg',
)
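If the goal is the whole first <img> tag as a string, as in the question, rather than just its src, a DOM-based sketch could look like the following. It assumes DOMDocument::saveHTML() is called with a node argument to serialise only that element; the libxml_use_internal_errors() call is an added precaution against parser warnings on messier input and is not part of the answer above.

```php
<?php
// Shortened copy of the sample data from the question.
$content = 'This is some sample data which is going to contain an image in the format '
    . '<img src="http://www.randomdomain.com/randomfolder/randomimagename.jpg">. '
    . 'It will also contain lots of other text.';

// Collect parser warnings internally instead of printing them (assumed precaution).
libxml_use_internal_errors(true);

$dom = new DOMDocument;
$dom->loadHTML($content);

$extracted_image = '';
// item(0) returns null when the document contains no <img> element.
$first_img = $dom->getElementsByTagName('img')->item(0);
if ($first_img !== null) {
    // Serialise just this one element back to an HTML string.
    $extracted_image = $dom->saveHTML($first_img);
}

echo 'NEW CONTENT IS ' . $extracted_image, PHP_EOL;
```

The serialised tag is rebuilt by the parser, so quoting and attribute spacing may differ slightly from the original markup.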