Question

I tried following some of the questions here about preg_match and the DOM, but it all went over my head.

I have a string like this:

$string = '<td class="borderClass" width="225" style="border-width: 0 1px 0 0;" valign="top">
<div style="text-align: center;">
    <a href="http://myanimelist.net/anime/10800/Chihayafuru/pic&pid=35749">
    <img src="http://cdn.myanimelist.net/images/anime/3/35749.jpg" alt="Chihayafuru" align="center">
    </a>
</div>';

I am now trying to get the image src attribute value from it. I tried the following code, but I can't figure out what I'm doing wrong.

$doc = new DOMDocument();
$dom->loadXML( $string );
$imgs = $dom->query("//img");
for ($i=0; $i < $imgs->length; $i++) {
    $img = $imgs->item($i);
    $src = $img->getAttribute("src");
}
$scraped_img = $src;

How can I get the image src attribute using PHP?

Answer 1

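Here is corrected code you can use. The original had three problems: it created $doc but then called methods on an undefined $dom, it used loadXML() on HTML that is not well-formed XML, and query() is a method of DOMXPath, not of DOMDocument.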

$string = '<td class="borderClass" width="225" style="border-width: 0 1px 0 0;" valign="top">
<div style="text-align: center;">
    <a href="http://myanimelist.net/anime/10800/Chihayafuru/pic&pid=35749">
    <img src="http://cdn.myanimelist.net/images/anime/3/35749.jpg" alt="Chihayafuru" align="center">
    </a>
</div>';

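$doc = new DOMDocument();
// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML( $string );
$xpath = new DOMXPath($doc);
$imgs = $xpath->query("//img");
for ($i=0; $i < $imgs->length; $i++) {
    $img = $imgs->item($i);
    // Overwritten on every pass, so $src ends up holding the last image's src.
    $src = $img->getAttribute("src");
}

echo $src;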

Output

http://cdn.myanimelist.net/images/anime/3/35749.jpg
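
If the markup might contain more than one image, the loop above keeps only the last src. A minimal sketch of collecting all of them follows; the helper name extractImageSources and the shortened sample string are assumptions for illustration and are not part of the original answer.

```php
<?php
// Hypothetical helper (not from the original answer): returns the src of
// every <img> element that has one, in document order.
function extractImageSources(string $html): array
{
    $doc = new DOMDocument();

    // Silence libxml warnings about sloppy markup, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $sources = [];
    // The [@src] predicate skips <img> elements without a src attribute.
    foreach ($xpath->query('//img[@src]') as $img) {
        $sources[] = $img->getAttribute('src');
    }
    return $sources;
}

// Shortened stand-in for the $string used in the question.
$string = '<div><img src="http://cdn.myanimelist.net/images/anime/3/35749.jpg" alt="Chihayafuru"></div>';
$sources = extractImageSources($string);
echo implode("\n", $sources), "\n";
```

Returning an array leaves it to the caller to decide whether to take the first match, the last one, or all of them.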

Answer 2

While writing Drupal, we found that using SimpleXML is much easier than working with the DOM:

$htmlDom = new \DOMDocument();
@$htmlDom->loadHTML('<?xml encoding="UTF-8">' . $string);
$elements = simplexml_import_dom($htmlDom);
print $elements->body->td[0]->div[0]->a[0]->img[0]['src'];

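This lets you load any HTML tag soup, because DOM is more forgiving than SimpleXML, while still letting you use the simple yet powerful SimpleXML extension.

The first three lines are copied verbatim from the Drupal test framework; it is truly battle-hardened code.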
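
The hard-coded path body->td[0]->div[0]->a[0]->img[0] depends on the exact nesting of the markup. As a sketch under that caveat, SimpleXML's xpath() method can find the images wherever they sit; the shortened sample string below is an assumption standing in for the question's $string.

```php
<?php
// Shortened stand-in for the $string used in the question.
$string = '<div><a href="#"><img src="http://cdn.myanimelist.net/images/anime/3/35749.jpg" alt="Chihayafuru"></a></div>';

$htmlDom = new DOMDocument();
// Same loading approach as in the answer: @ suppresses parser warnings and
// the XML prefix hints that the input is UTF-8.
@$htmlDom->loadHTML('<?xml encoding="UTF-8">' . $string);
$elements = simplexml_import_dom($htmlDom);

$sources = [];
// xpath() returns an array of SimpleXMLElement objects; attributes are read
// with array syntax and cast to string to get the plain value.
foreach ($elements->xpath('//img') as $img) {
    $sources[] = (string) $img['src'];
}
echo implode("\n", $sources), "\n";
```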

Answer 3

stages: ["build", "test", "deploy"]

before_script:
    - npm install
    - CI=false

build:
    stage: build
    script: npm run build

test:
    stage: test
    script: npm run test

deploy:
    stage: deploy
    script: npm run start

How do I get the image src attribute value from a PHP string?
