How to get image paths from an Amazon RSS feed using cURL in PHP

Asked: 2014-12-29 09:22:59

Tags: php curl rss

I want to get an image from an XML tag using PHP's cURL functions. I am trying to get the images from an Amazon RSS feed. Here is the link: http://www.amazon.co.uk/gp/rss/bestsellers/electronics/560834/ref=zg_bs_560834_rsslink

This is the PHP code I am using to fetch the images with cURL, but it is not working, so please help me.

<?php 
$feed = "http://www.amazon.co.uk/gp/rss/bestsellers/electronics/560834/ref=zg_bs_560834_rsslink";
// Use cURL to fetch text
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $feed);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$rss = curl_exec($ch);
curl_close($ch);

// Manipulate string into object
$rss = simplexml_load_string($rss);
$siteTitle = $rss->channel->title;
$cnt = count($rss->channel->item);

for($i=0; $i<4; $i++)
{
    $url = $rss->channel->item[$i]->link;
    $title = $rss->channel->item[$i]->title;
    $desc = $rss->channel->item[$i]->description;
    echo '<h3><a href="'.$url.'">'.$title.'</a></h3>';
    $image = $rss->channel->item[$i]->description->img->attributes()->src;

    echo "Image Path : ".$image;
}

1 answer:

Answer 0 (score: 0)

I do not see an image URL in the channel description, so there is no channel-wide image to begin with.

<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Amazon.co.uk: Bestsellers in Electronics &gt; Camera &amp; Photo</title>
    <link>http://www.amazon.co.uk/Best-Sellers-Electronics-Camera-Photo/zgbs/electronics/560834/ref=pd_zg_rss_ts_ce_560834_c</link>
    <description><![CDATA[The most popular items in Camera & Photo (Updated hourly). Note: Product prices and availability were accurate at the time this feed was generated but are subject to change.]]></description>
    <pubDate>Tue, 30 Dec 2014 04:54:42 GMT</pubDate>
    <lastBuildDate>Tue, 30 Dec 2014 04:54:42 GMT</lastBuildDate>
    <ttl>60</ttl>
    <generator>Amazon Community RSS 2.0</generator>
    <language>en-gb</language>
    <copyright>Copyright 2014, Amazon.com</copyright>    
    <item>
      ...snip...
    </item>
  </channel>
</rss>
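As a rough check of the point above: RSS 2.0 allows an optional `<image>` element inside `<channel>`, and this feed does not have one. The sketch below tests for it with SimpleXML. The helper name `channelImageUrl` and the trimmed sample feed are assumptions made for illustration, not part of the original code.

```php
<?php
// Hypothetical helper: returns the channel-level image URL, or null if the feed has none.
function channelImageUrl(SimpleXMLElement $rss): ?string
{
    // RSS 2.0 permits an optional <image><url>...</url></image> inside <channel>.
    if (isset($rss->channel->image->url)) {
        return (string) $rss->channel->image->url;
    }
    return null;
}

// Trimmed sample modelled on the channel shown above (it has no <image> element).
$sample = <<<'XML'
<rss version="2.0">
  <channel>
    <title>Amazon.co.uk: Bestsellers in Electronics &gt; Camera &amp; Photo</title>
    <ttl>60</ttl>
  </channel>
</rss>
XML;

$rss = simplexml_load_string($sample);
$channelImage = channelImageUrl($rss);
echo $channelImage === null ? "No channel-wide image\n" : "Channel image: $channelImage\n";
```

For a feed shaped like this one, the helper returns null, which matches what the raw XML shows.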

Now, here is the first item from the copy of the RSS feed I fetched:

<item>
  <title>#1: Samsung UHS-I CLASS 10 32GB SD Micro Card with Adapter</title>
  <guid isPermaLink="false">top-sellers_electronics_560834_B00J29BR3Y</guid>
  <link>http://www.amazon.co.uk/Samsung-UHS-I-CLASS-Micro-Adapter/dp/B00J29BR3Y/ref=pd_zg_rss_ts_ce_560834_1</link>
  <pubDate>Tue, 30 Dec 2014 04:54:42 GMT</pubDate>
  <description><![CDATA[<div style="float:left;">
    <a class="url" href="http://www.amazon.co.uk/Samsung-UHS-I-CLASS-Micro-Adapter/dp/B00J29BR3Y/ref=pd_zg_rss_ts_ce_560834_1"><img src="http://ecx.images-amazon.com/images/I/41PsVtmh0KL._SL160_.jpg" alt="Samsung UHSI" border="0" hspace="0" vspace="0" /></a>
    </div><span class="riRssTitle">
    <a href="http://www.amazon.co.uk/Samsung-UHS-I-CLASS-Micro-Adapter/dp/B00J29BR3Y/ref=pd_zg_rss_ts_ce_560834_1">Samsung UHS-I CLASS 10 32GB SD Micro Card with Adapter</a>
    </span> <br /><span class="riRssContributor">by Samsung</span> <br />
    <img src="http://g-ecx.images-amazon.com/images/G/02/x-locale/common/icons/uparrow_green_trans._V192561975_.gif" width="13" align="abstop" alt="Ranking has gone up in the past 24 hours" title="Ranking has gone up in the past 24 hours" height="11" border="0" />
    <font color="green"><strong></strong></font> 183 days in the top 100 <br />
    <img src="http://g-ecx.images-amazon.com/images/G/02/detail/stars-4-5._V192253866_.gif" width="64" height="12" border="0" style="margin: 0; padding: 0;"/>(1432)
    <br /><br /><a href="http://www.amazon.co.uk/Samsung-UHS-I-CLASS-Micro-Adapter/dp/B00J29BR3Y/ref=pd_zg_rss_ts_ce_560834_1">Buy new:
    </a> <strike>£24.99</strike>  <font color="#990000"><b>£13.72</b></font>
    <br /><a href="http://www.amazon.co.uk/gp/offer-listing/B00J29BR3Y/ref=pd_zg_rss_ts_ce_560834_1">26 used & new</a>
    from <span class="price">£9.15</span><br /><br />(Visit the
    <a href="http://www.amazon.co.uk/Best-Sellers-Electronics-Camera-Photo/zgbs/electronics/560834/ref=pd_zg_rss_ts_ce_560834_1">Bestsellers in Camera & Photo</a>
    list for authoritative information on this product's current rank.)]]>
  </description>
</item>

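Again, there is no image element in the RSS itself. You can see images inside the description, which means that displaying the description will include an image (several, in this case).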

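If you really want just one image, you can parse the HTML of the description, search for the <img> tags, and pick the one that fits the size of your final page. It looks like you have already tried that...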

However, your code seems to expect the description to be parsed by the call to the XML parser:

$rss = simplexml_load_string($rss);

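It does not work that way, because the description is wrapped in CDATA, which means the XML parser sees it as one large block of text (which is the correct way to do it here).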
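A small sketch of what that means in practice, using a shortened item modelled on the feed above (the sample string is an assumption for illustration): SimpleXML reports no child elements under `<description>`, and the markup is only reachable as the node's text.

```php
<?php
// Minimal item modelled on the feed above; the markup inside CDATA is shortened.
$sample = <<<'XML'
<item>
  <title>#1: Samsung UHS-I CLASS 10 32GB SD Micro Card with Adapter</title>
  <description><![CDATA[<div><img src="http://ecx.images-amazon.com/images/I/41PsVtmh0KL._SL160_.jpg" alt="Samsung UHSI" /></div>]]></description>
</item>
XML;

$item = simplexml_load_string($sample);

// The parser sees no child elements under <description> ...
$childCount = count($item->description->children());

// ... so element-style access such as ->img finds nothing.
$imgCount = count($item->description->img);

// The markup is only available as the text content of the node.
$html = (string) $item->description;

echo "children: $childCount, img elements: $imgCount\n";
echo $html, "\n";
```

This is why `$rss->channel->item[$i]->description->img` in the question has nothing to return: the `<img>` tag exists only inside the string.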

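So you have to extract the description, parse it separately, and then search the resulting sub-document for img tags. (Note that the description may well be HTML rather than XML, in which case an XML parser may reject it because some tags are not closed properly for XML. This version appears to be XML-compatible.)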

// Cast to string: the CDATA content is the text of the <description> node
$description = (string) $rss->channel->item[$i]->description;
// Returns false when the fragment is not well-formed XML
$desc_xml = simplexml_load_string($description);
if ($desc_xml !== false)
{
   // Counts only <img> elements that are direct children of the root element
   $max = count($desc_xml->img);
   for($j = 0; $j < $max; ++$j)
   {
      $url = (string) $desc_xml->img[$j]['src'];
      // ...
   }
}

Something like that.
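If the description turns out not to be well-formed XML, an HTML parser is a more forgiving option. The item shown earlier contains a bare `&` ("26 used & new") and several top-level elements, both of which a strict XML parser rejects. The sketch below swaps in `DOMDocument::loadHTML` for that reason. The helper name `extractImageSources` and the shortened sample description are assumptions made for illustration.

```php
<?php
// Hypothetical helper: returns the src of every <img> found in an HTML fragment.
function extractImageSources(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally: feed HTML is rarely perfectly well formed.
    $previous = libxml_use_internal_errors(true);
    // Wrapping in a <div> keeps the input non-empty and gives the fragment one container.
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $sources = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        $src = $img->getAttribute('src');
        if ($src !== '') {
            $sources[] = $src;
        }
    }
    return $sources;
}

// Shortened description modelled on the first item above (links replaced by "#").
$description = <<<'HTML'
<div style="float:left;">
<a class="url" href="#"><img src="http://ecx.images-amazon.com/images/I/41PsVtmh0KL._SL160_.jpg" alt="Samsung UHSI" border="0" /></a>
</div>
<img src="http://g-ecx.images-amazon.com/images/G/02/detail/stars-4-5._V192253866_.gif" width="64" height="12" border="0" />(1432)
<br /><a href="#">26 used & new</a>
HTML;

$images = extractImageSources($description);
$first = $images[0] ?? null; // in this sample the product image comes first
print_r($images);
```

Inside the question's loop this would be called as `extractImageSources((string) $rss->channel->item[$i]->description)`. Taking the first entry is only a guess based on the sample item, where the product image precedes the rating and ranking icons.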