
Question

Using PHP, how can I isolate the contents of the src attribute from $foo? The end result I'm looking for would give me just "http://example.com/img/image.jpg".

$foo = '<img class="foo bar test" title="test image" src="http://example.com/img/image.jpg" alt="test image" width="100" height="100" />';

Answer 1

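If you don't want to use a regular expression (or any non-standard PHP component), a reasonable solution using the built-in DOMDocument class would be as follows: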

<?php
    $doc = new DOMDocument();
    $doc->loadHTML('<img src="http://example.com/img/image.jpg" ... />');
    $imageTags = $doc->getElementsByTagName('img');

    foreach($imageTags as $tag) {
        echo $tag->getAttribute('src');
    }
?>
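
As a sketch of how this might be applied to the $foo string from the question and to fragments containing several images: the helper name extractImgSrcs is hypothetical (it is not part of PHP), and the libxml_use_internal_errors() calls are an assumption added here to keep parser warnings about imperfect markup out of the output.

```php
<?php
// Hypothetical helper: collect the src of every <img> in an HTML fragment.
function extractImgSrcs(string $html): array
{
    // Assumption: collect libxml warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    // Discard collected warnings and restore the previous error setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $srcs = [];
    foreach ($doc->getElementsByTagName('img') as $tag) {
        // Skip images that have no src attribute at all.
        if ($tag->hasAttribute('src')) {
            $srcs[] = $tag->getAttribute('src');
        }
    }
    return $srcs;
}

$foo = '<img class="foo bar test" title="test image" src="http://example.com/img/image.jpg" alt="test image" width="100" height="100" />';
print_r(extractImgSrcs($foo));
```

Returning an array rather than echoing keeps the single-image and multi-image cases the same for the caller.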

Answer 2

Code

<?php
    $foo = '<img class="foo bar test" title="test image" src="http://example.com/img/image.jpg" alt="test image" width="100" height="100" />';
    $array = array();
    preg_match( '/src="([^"]*)"/i', $foo, $array ) ;
    print_r( $array[1] ) ;

Output

http://example.com/img/image.jpg
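
A slightly more defensive variant, as a sketch only: it assumes the attribute value may be wrapped in either double or single quotes, and it checks the return value of preg_match() before reading the capture group. The variable name $src and the null fallback are illustrative choices.

```php
<?php
$foo = '<img class="foo bar test" title="test image" src="http://example.com/img/image.jpg" alt="test image" width="100" height="100" />';

// Group 1 captures the opening quote; the backreference \1 requires the same closing quote.
// Group 2 captures the attribute value itself.
if (preg_match('/\bsrc\s*=\s*(["\'])(.*?)\1/i', $foo, $m) === 1) {
    $src = $m[2];
} else {
    // No quoted src attribute was found.
    $src = null;
}

var_dump($src);
```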

Answer 3

// Requires the Simple HTML DOM library (linked below) to be included first
// Create DOM from string
$html = str_get_html('<img class="foo bar test" title="test image" src="http://example.com/img/image.jpg" alt="test image" width="100" height="100" />');

// echo the src attribute
echo $html->find('img', 0)->src;

http://simplehtmldom.sourceforge.net/

Answer 4

I ended up with this code:

$dom = new DOMDocument();
$dom->loadHTML($img);
echo $dom->getElementsByTagName('img')->item(0)->getAttribute('src');

Assuming there is only one img :P
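
If that assumption might not hold, item(0) returns null when the markup contains no img at all, and calling getAttribute() on null is a fatal error. A minimal sketch with a guard, where $img is given the string from the question and the null fallback is an illustrative choice:

```php
<?php
$img = '<img class="foo bar test" title="test image" src="http://example.com/img/image.jpg" alt="test image" width="100" height="100" />';

$dom = new DOMDocument();
$dom->loadHTML($img);

// item(0) is the first <img> element, or null if there is none.
$node = $dom->getElementsByTagName('img')->item(0);
$src = ($node !== null) ? $node->getAttribute('src') : null;

var_dump($src);
```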

Answer 5

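I'm late to the party, but there is a simple solution that hasn't been mentioned yet. Load the string with simplexml_load_string (if you have simplexml enabled), then flip it through json_encode and json_decode.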

$foo = '<img class="foo bar test" title="test image" src="http://example.com/img/image.jpg" alt="test image" width="100" height="100" />';

$parsedFoo = json_decode(json_encode(simplexml_load_string($foo)), true);
var_dump($parsedFoo['@attributes']['src']); // output: "http://example.com/img/image.jpg"

$parsedFoo comes out as

array(1) {
  ["@attributes"]=>
  array(6) {
    ["class"]=>
    string(12) "foo bar test"
    ["title"]=>
    string(10) "test image"
    ["src"]=>
    string(32) "http://example.com/img/image.jpg"
    ["alt"]=>
    string(10) "test image"
    ["width"]=>
    string(3) "100"
    ["height"]=>
    string(3) "100"
  }
}

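I've been using this to parse XML and HTML for a few months now and it has worked pretty well. I haven't hit any hiccups yet, though I haven't used it to parse a large file (I imagine that using json_encode and json_decode like this gets slower as the input gets larger). It's convoluted, but it's by far the easiest way I've found to read HTML attributes.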
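
When only a single attribute is needed, the JSON round trip may not be necessary, because SimpleXMLElement exposes attributes through array-style access. A minimal sketch, assuming the input is well-formed XML (a self-closed img tag, as in the question); simplexml_load_string() returns false otherwise. The null fallback is an illustrative choice.

```php
<?php
$foo = '<img class="foo bar test" title="test image" src="http://example.com/img/image.jpg" alt="test image" width="100" height="100" />';

$element = simplexml_load_string($foo);

// $element['src'] is itself a SimpleXMLElement, so cast it to get a plain string.
$src = ($element !== false && isset($element['src']))
    ? (string) $element['src']
    : null;

var_dump($src);
```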

Answer 6

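Try this pattern (the x modifier makes PCRE ignore the whitespace written inside the pattern, and i makes it case-insensitive):

'/< \s* img [^\>]* src \s* = \s* [\"\']? ( [^\"\'\s>]* )/ix'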
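
A usage sketch for a pattern of this shape, applied to the $foo string from the question. Because the pattern is written with spaces between its parts, it needs the x (extended) modifier to match real markup, so the modifier is included here; the variable names are illustrative.

```php
<?php
$foo = '<img class="foo bar test" title="test image" src="http://example.com/img/image.jpg" alt="test image" width="100" height="100" />';

// i: case-insensitive; x: ignore the literal whitespace inside the pattern.
$pattern = '/< \s* img [^\>]* src \s* = \s* [\"\']? ( [^\"\'\s>]* )/ix';

$src = null;
if (preg_match($pattern, $foo, $matches) === 1) {
    // Group 1 holds the attribute value without its quotes.
    $src = $matches[1];
}

echo $src;
```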

Answer 7

Here is what I ended up doing, although I'm not sure how efficient it is:

$imgsplit = explode('"',$data);
foreach ($imgsplit as $item) {
    if (strpos($item, 'http') !== FALSE) {
        $image = $item;
        break;
    }
}

Answer 8

preg_match solves this problem nicely.

See my answer here: How to extract img src, title and alt from html using php?

Answer 9

You can solve this problem with this function:


function getTextBetween($start, $end, $text)
{
 $start_from = strpos($text, $start);
 $start_pos = $start_from + strlen($start);
 $end_pos = strpos($text, $end, $start_pos + 1);
 $subtext = substr($text, $start_pos, $end_pos);
 return $subtext;
}

$foo = '<img class="foo bar test" title="test image" 
src="http://example.com/img/image.jpg" alt="test image"
width="100" height="100" />';

$img_src = getTextBetween('src="', '"', $foo);

Answer 10

I use preg_match_all to capture all images in an HTML document:

preg_match_all("~<img[^>]*src\s*=\s*[\"']([^\"']+)[\"'][^>]*>~i", $body, $matches);

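This one allows a more relaxed declaration syntax, with spaces and different quote types.

The regex reads: <img (any attributes, like style or border) src (possible space) = (possible space) (' or ") (any non-quote characters) (' or ") (anything up to >) (>)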
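
A usage sketch: the captured URLs end up in $matches[1]. The sample $body is made up for illustration. Note that the pattern below uses [^>]* before src, which keeps each match inside a single tag even when several images share a line.

```php
<?php
// Made-up sample markup: two images on one line, with different quoting and spacing.
$body = '<p><img src="a.png" alt="A"><img class="x" src = \'b.jpg\' /></p>';

preg_match_all('~<img[^>]*src\s*=\s*["\']([^"\']+)["\'][^>]*>~i', $body, $matches);

// $matches[0] holds the full tags, $matches[1] the captured src values.
print_r($matches[1]);
```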

Answer 11

Let's assume I use

$text ='<img src="blabla.jpg" alt="blabla" />';

in

getTextBetween('src="','"',$text);

The code will return:

blabla.jpg" alt="blabla"

This is wrong; we want the code to return the text between the quotes of the attribute value, i.e. attr="value".

So

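function getTextBetween($start, $end, $text)
{
    // Split once at the start marker and keep what follows it.
    // end() takes its argument by reference, so store the array in a variable first.
    $parts = explode($start, $text, 2);
    $first_strip = end($parts);

    // Split at the end marker and keep what precedes it
    $final_strip = explode($end, $first_strip)[0];
    return $final_strip;
}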

does the trick!

Trying

   getTextBetween('src="','"',$text);

will return:

blabla.jpg

Thanks a lot, because your solution gave me an insight into the final solution.

Regex PHP - isolating the src attribute from an img tag

11 answers:
