Extract image src from text?

Asked: 2012-07-11 19:46:16

Tags: php

I have a variable $content that holds some text mixed with images (an unknown number of them), in this form:

    text text text text <img src="path/to/image/1">text text text text
    <img src="path/to/image/2">
    text text text text text text text text text text text text text text text text <img src="path/to/image/3"><img src="path/to/image/4">text text text text
    <img src="path/to/image/5">

I want to extract the src of every image and store them in an array using PHP:

    array(
        [1] => "path/to/image/1"
        [2] => "path/to/image/2"
        [3] => "path/to/image/3"
        [4] => "path/to/image/4"
        [5] => "path/to/image/5"
        .
        .
        .
    )

What is the best way to do this? I have tried the explode() function, but that approach seems inefficient.

3 answers:

Answer 0 (score: 6)

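    $dom = new DOMDocument();
    // Set before loading; setting it afterwards has no effect on the parsed document
    $dom->preserveWhiteSpace = false;
    // $content is the variable from the question
    $dom->loadHTML($content);
    $imgs  = $dom->getElementsByTagName("img");
    $links = array();
    for ($i = 0; $i < $imgs->length; $i++) {
        // Collect the src attribute of each <img> element
        $links[] = $imgs->item($i)->getAttribute("src");
    }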
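
A self-contained variant of the DOMDocument approach, offered as a rough sketch rather than a definitive implementation. The function name `extractImageSources` is hypothetical. The sketch assumes the input is an HTML fragment like the one in the question, silences libxml parse warnings while loading it, and skips `<img>` tags that have no `src` attribute.

```php
<?php
// Hypothetical helper (not part of the original answer): returns the src of every <img>.
function extractImageSources(string $content): array
{
    // loadHTML() rejects an empty string, so return early in that case.
    if (trim($content) === '') {
        return [];
    }

    // Collect libxml parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($content);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $sources = [];
    foreach ($dom->getElementsByTagName('img') as $img) {
        // Skip <img> tags that carry no src attribute.
        if ($img->hasAttribute('src')) {
            $sources[] = $img->getAttribute('src');
        }
    }
    return $sources;
}

$content = 'text <img src="path/to/image/1">text <img src="path/to/image/2"> text';
print_r(extractImageSources($content));
```

The resulting array is indexed from 0. If keys starting at 1 are needed, as in the question, the result could be re-keyed afterwards.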

Answer 1 (score: 3)

Here is an example using simplehtmldom:

include("simple_html_dom.php");
$content = '
text text text text <img src="path/to/image/1">text text text text
    <img src="path/to/image/2">
text text text text text text text text text text text text text text text text <img src="path/to/image/3"><img src="path/to/image/4">text text text text 
<img src="path/to/image/5"> ';

$html = str_get_html($content);
$images = $html->find("img");
$links = array();
foreach($images as $image) {
  $links[] = $image->src;
}

print_r($links);

Output:

Array
(
    [0] => path/to/image/1
    [1] => path/to/image/2
    [2] => path/to/image/3
    [3] => path/to/image/4
    [4] => path/to/image/5
)

Answer 2 (score: -1)

Using a regular expression:

<?php

$str = '    text text text text <img src="path/to/image/1">text text text text
    <img src="path/to/image/2">
text text text text text text text text text text text text text text text text <img src="path/to/image/3"><img src="path/to/image/4">text text text text
<img src="path/to/image/5">';


preg_match_all('@<img.*src="([^"]*)"[^>/]*/?>@Ui', $str, $out);

print_r($out[1]);

?>

Output:

Array
(
    [0] => path/to/image/1
    [1] => path/to/image/2
    [2] => path/to/image/3
    [3] => path/to/image/4
    [4] => path/to/image/5
)
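
The pattern above only handles double-quoted src values. A slightly more tolerant variation could look like the sketch below; it is an illustration, not the answerer's pattern, and the function name `extractImageSourcesRegex` is hypothetical. It accepts single or double quotes, allows other attributes before src, and ignores attributes such as data-src. It remains a heuristic: unusual markup (for example, the text src= inside another attribute's value) can still fool it, which is where a DOM parser is more reliable.

```php
<?php
// Hypothetical helper; the pattern is an illustrative variation on the regex approach.
function extractImageSourcesRegex(string $content): array
{
    // <img, any attributes, whitespace, src = "value" or 'value' (same quote on both sides).
    $pattern = '/<img\b[^>]*?\ssrc\s*=\s*(["\'])(.*?)\1/is';
    preg_match_all($pattern, $content, $matches);
    // Group 2 holds the attribute values; it is an empty array when nothing matches.
    return $matches[2];
}

$str = '<img src="path/to/image/1"> text <img alt="x" src=\'path/to/image/2\' />';
print_r(extractImageSourcesRegex($str));
```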