I'm trying to get the thumbnail link:
https://i.pinimg.com/236x/38/8f/c9/388fc91621d9d12db3d1211b39ab0fc1--flying-dog-pure-joy.jpg
but for some reason getElementsByTagName isn't returning what I want.
$newdom=new DOMDocument();
$xml=simplexml_load_file("https://www.pinterest.co.uk/sucastro/animals.rss");
$newdom->loadXML($xml);
$out=$newdom->getElementsByTagName('img');
print_r($out);
I also tried
$out=$newdom->channel->item->description->getElementsByTagName('img');
which failed as well. Dumping the loaded feed with print_r shows:
SimpleXMLElement Object
(
[@attributes] => Array
(
[version] => 2.0
)
[channel] => SimpleXMLElement Object
(
[title] => ANIMALS
[link] => https://www.pinterest.com/sucastro/animals/
[description] => SimpleXMLElement Object
(
)
[language] => en-us
[lastBuildDate] => Fri, 48 Jan 2017 33:33:33 +0000
[item] => Array
(
[0] => SimpleXMLElement Object
(
[title] => Hi ladies. Let's pin
[link] => https://www.pinterest.com/pin/209628557639623067/
[description] => <a href="/pin/209628557639623067/"><img src="https://i.pinimg.com/236x/38/8f/c9/388fc91621d9d12db3d1211b39ab0fc1--flying-dog-pure-joy.jpg"></a>Hi ladies. Let's pin GREEN AND WHITE today ❤️
[pubDate] => Fri, 08 Sep 2017 20:04:02 +0000
[guid] => https://www.pinterest.com/pin/209628557639623067/
)
I've been looking around for hours and trying different things, but for some reason it still doesn't work.
Answer 0 (score: 0)
This simply pulls the image out of each item's description in the RSS channel.
You can try this:
<?php
$xml=simplexml_load_file("https://www.pinterest.co.uk/sucastro/animals.rss");
foreach ($xml->channel->item as $item) {
    // The description holds HTML as plain text, so match against its string value.
    if (preg_match('/<img.+src=[\'"](?P<src>.+?)[\'"].*>/i', (string) $item->description, $image)) {
        echo $image['src'];
        echo "<br/>";
    }
}
?>
Output
https://i.pinimg.com/236x/38/8f/c9/388fc91621d9d12db3d1211b39ab0fc1--flying-dog-pure-joy.jpg
https://i.pinimg.com/236x/d7/1f/dc/d71fdc7f19d8c4b896840e4d8c65642f.jpg
https://i.pinimg.com/236x/94/2a/fb/942afb5bab44f6f3b4b9ea4f35690c54--national-forest-cute-photos.jpg
https://i.pinimg.com/236x/cc/3c/26/cc3c2658f23571a8eb34a0e34d71e80a--majestic-animals-lion-cub.jpg
https://i.pinimg.com/236x/98/1f/c6/981fc6e5e1e332e0067c1b6748fce20a--swan-lake-birds-.jpg
https://i.pinimg.com/236x/86/90/8d/86908dd2ce9b8c73648cdcbc5d3325e1--white-bunnies-white-rabbits.jpg
https://i.pinimg.com/236x/e0/7b/ce/e07bcea2ec026b5900afd0dc0c35ef71--autumn-leaves-west-highland-terrier.jpg
https://i.pinimg.com/236x/8f/87/e9/8f87e95b0384a0f227c9078d053cc202--new-friends-my-friend.jpg
https://i.pinimg.com/236x/5e/8f/3b/5e8f3bc079c7617d8cf3a91568f402cd--love-my-dog-puppy-love.jpg
https://i.pinimg.com/236x/9f/6a/79/9f6a7948c1cf3d1fbc067d52691bbc7d--water-fountains-bird-baths.jpg
https://i.pinimg.com/236x/07/09/d0/0709d0b16178da14facd8401a8472a3d--dachshund-humor-dapple-dachshund.jpg
https://i.pinimg.com/236x/2e/c7/af/2ec7af9628a30ec3901c3d14d2b6dc08--animal-portraits-animals-photos.jpg
https://i.pinimg.com/236x/06/2e/4b/062e4b396631ae6fc52a64c2f32d8022--cat-whiskers-exotic-cats.jpg
https://i.pinimg.com/236x/5e/5c/d6/5e5cd648d4efb29b82607445b795d357--teacups-tea-time.jpg
https://i.pinimg.com/236x/dc/30/ce/dc30ce6b41970dc93f1a3607cb48ab2a--pretty-birds-beautiful-birds.jpg
https://i.pinimg.com/236x/80/63/e1/8063e102ba108b0f6531065bdf50687c--bear-costume-the-zoo.jpg
https://i.pinimg.com/236x/c6/f3/1e/c6f31efadca1b2f71b898a85c46a6f87--baby-polar-bears-panda-bears.jpg
https://i.pinimg.com/236x/c2/13/c3/c213c317cc750ac5fef00c4ebd1a1b71--elephant-love-baby-elephants.jpg
https://i.pinimg.com/236x/49/75/d8/4975d85611c291c50db1ee57f6018bef.jpg
https://i.pinimg.com/236x/6b/cd/35/6bcd353ac7250d4159d3c74fe0d75747--beautiful-swan-beautiful-birds.jpg
https://i.pinimg.com/236x/1f/fa/4c/1ffa4c7d4f977a9c43c5655a4a3ea0ab--fox-baby-pet-fox.jpg
https://i.pinimg.com/236x/1c/9e/e1/1c9ee1aba651fde0473a4abcc8cab6c4--baby-cheetahs-travel-photography.jpg
https://i.pinimg.com/236x/fd/cd/40/fdcd40daae027c7f51298a543a015791--baby-caracal-baby-bobcat.jpg
https://i.pinimg.com/236x/cc/75/b7/cc75b77d9cea128fcab6daa1d57826af--leopard-animal-baby-leopard.jpg
https://i.pinimg.com/236x/3c/93/29/3c9329976aa40675483e026f4ac5baa6--animaux-totems-can-to.jpg
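Answer 1 (score: 0)
You've got things a bit mixed up. You load the feed into a SimpleXMLElement, but then try to load that object into a DOMDocument.
All you need to do is take the SimpleXMLElement, iterate over the item elements, and read their descriptions. Note that you can't use XPath or anything similar here (short of turning the description into another SimpleXMLElement) to get the img source, because at this point you are dealing with just the text value of an element.
So what you can do here is check whether the description contains an image and, if it does, extract the source: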
<?php
// No DOMDocument is needed; SimpleXML alone can walk the feed.
$xml=simplexml_load_file("https://www.pinterest.co.uk/sucastro/animals.rss");
foreach ($xml->channel->item as $item) {
    // Cast to string to get the HTML held in the description's text.
    $desc = (string) $item->description;
    if (preg_match("/<img src=\"(.*?)\"/i", $desc, $m)) {
        echo "Image: ".$m[1]."<br>";
    }
}
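As a minimal sketch of the point above (the description is just a string once it is read through SimpleXML), the following runs offline against a made-up inline feed rather than the real Pinterest one. The helper name extractImgSrc and the example.com URLs are assumptions for illustration only. The pattern is slightly more tolerant than the one above: it accepts single or double quotes and other attributes before src.

```php
<?php
// Hypothetical sample feed standing in for the real one; the HTML inside
// <description> is entity-escaped, as is common in RSS.
$rss = <<<'XML'
<rss version="2.0"><channel>
<item><description>&lt;a href="/pin/1/"&gt;&lt;img src="https://example.com/a.jpg"&gt;&lt;/a&gt;First</description></item>
<item><description>No image here</description></item>
<item><description>&lt;img alt='x' src='https://example.com/b.jpg'&gt;</description></item>
</channel></rss>
XML;

// Hypothetical helper: returns the first img src found in an HTML fragment,
// or null when the fragment contains no img tag with a quoted src.
function extractImgSrc(string $html): ?string
{
    if (preg_match('/<img\b[^>]*?\bsrc\s*=\s*(["\'])(.*?)\1/i', $html, $m)) {
        return $m[2];
    }
    return null;
}

$xml = simplexml_load_string($rss);
$sources = [];
foreach ($xml->channel->item as $item) {
    // Casting yields the decoded text content of <description>.
    $src = extractImgSrc((string) $item->description);
    if ($src !== null) {
        $sources[] = $src;
    }
}
print_r($sources);
```

Items without an image are simply skipped, so the loop never reads an undefined match.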
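Answer 2 (score: 0)
It is possible to stick with XML and XPath, but you have to decode each chunk as you read it, so it is not as efficient as it could be. The code reads the RSS feed and then extracts all of the description elements.
It then decodes the encoded HTML and gets the src attribute from each fragment.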
<?php
error_reporting ( E_ALL );
ini_set ( 'display_errors', 1 );
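libxml_use_internal_errors(true); // collect HTML parse warnings instead of printing them
$feed = simplexml_load_file("https://www.pinterest.co.uk/sucastro/animals.rss");
$items = $feed->xpath("//item/description");
foreach ( $items as $item ) {
    // Parse each description as its own small HTML document.
    $doc = new DOMDocument();
    $doc->loadHTML(html_entity_decode((string) $item));
    $img = $doc->getElementsByTagName('img')->item(0);
    if ($img !== null) { // skip descriptions that contain no image
        echo $img->getAttribute('src').PHP_EOL;
    }
}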
This outputs (whitespace trimmed)...
https://i.pinimg.com/236x/38/8f/c9/388fc91621d9d12db3d1211b39ab0fc1--flying-dog-pure-joy.jpg
https://i.pinimg.com/236x/d7/1f/dc/d71fdc7f19d8c4b896840e4d8c65642f.jpg
https://i.pinimg.com/236x/94/2a/fb/942afb5bab44f6f3b4b9ea4f35690c54--national-forest-cute-photos.jpg
https://i.pinimg.com/236x/cc/3c/26/cc3c2658f23571a8eb34a0e34d71e80a--majestic-animals-lion-cub.jpg
https://i.pinimg.com/236x/98/1f/c6/981fc6e5e1e332e0067c1b6748fce20a--swan-lake-birds-.jpg
https://i.pinimg.com/236x/86/90/8d/86908dd2ce9b8c73648cdcbc5d3325e1--white-bunnies-white-rabbits.jpg
https://i.pinimg.com/236x/e0/7b/ce/e07bcea2ec026b5900afd0dc0c35ef71--autumn-leaves-west-highland-terrier.jpg
https://i.pinimg.com/236x/8f/87/e9/8f87e95b0384a0f227c9078d053cc202--new-friends-my-friend.jpg
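The decode-then-parse idea can also be tried without hitting the network. The sketch below is one possible variation, not the answer's exact method: it uses a made-up inline feed (the example.com URLs are placeholders), and it swaps getElementsByTagName for a DOMXPath query so that every img in a description is collected rather than only the first.

```php
<?php
// Collect libxml warnings internally instead of printing them; HTML
// fragments are rarely clean enough to parse without complaints.
libxml_use_internal_errors(true);

// Hypothetical sample feed (not the real Pinterest data).
$rss = <<<'XML'
<rss version="2.0"><channel>
<item><description>&lt;a href="/pin/1/"&gt;&lt;img src="https://example.com/a.jpg"&gt;&lt;/a&gt;First</description></item>
<item><description>Text only</description></item>
<item><description>&lt;img src="https://example.com/b.jpg"&gt;&lt;img src="https://example.com/c.jpg"&gt;</description></item>
</channel></rss>
XML;

$feed = simplexml_load_string($rss);
$sources = [];
foreach ($feed->xpath('//item/description') as $description) {
    // SimpleXML has already decoded the entities, so this is raw HTML.
    $html = trim((string) $description);
    if ($html === '') {
        continue; // loadHTML() rejects an empty string
    }
    $fragment = new DOMDocument();
    $fragment->loadHTML($html);
    // Query the src attribute of every <img> in the parsed fragment.
    $xpath = new DOMXPath($fragment);
    foreach ($xpath->query('//img/@src') as $attr) {
        $sources[] = $attr->nodeValue;
    }
}
libxml_clear_errors();
print_r($sources);
```

Building a DOMDocument per item costs more than a regular expression, but it copes with attribute order, quoting style, and multiple images without changes to a pattern.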