The XML I'm working with looks like this:
<item>
<title>$39.99 and Under Juniors' Swimwear</title>
<link>http://www.amazon.com/s/ref=xs_gb_rss_A1RFRNENBWTVO4/?rh=n:1036592,n:!2334084011,n:!2334146011,n:8021415011,p_6:ATVPDKIKX0DER&bbn=8021415011&ie=UTF8&qid=1398271335&rnid=15683531&ccmID=380205&tag=bugash-20</link>
<description><table><tr><td><a rel="nofollow" target="_blank" href="http://www.amazon.com/s/ref=xs_gb_rss_A1RFRNENBWTVO4/?rh=n:1036592,n:!2334084011,n:!2334146011,n:8021415011,p_6:ATVPDKIKX0DER&bbn=8021415011&ie=UTF8&qid=1398271335&rnid=15683531&ccmID=380205&tag=rssfeeds-20"><img src="http://ecx.images-amazon.com/images/I/31kwUz5PiZL._SL160_.jpg" alt="Product Image" style='border:0;'/></a></td><td><tr><td>$39.99 and Under Juniors' Swimwear</td></tr><tr><td>Expires May 10, 2014</td></tr></tr></table></description>
<guid isPermaLink="false">http://promotions.amazon.com/gp/goldbox/5159412---5opWoFoLiIfWceLGIhXzm2wwCMk=</guid>
<pubDate>Sat, 26 Apr 2014 07:00:00 +0000</pubDate>
</item>
I want to extract the 'img src' value from the 'description' tag. How do I do this in PHP?
Answer 0 (score: 0)
You can combine SimpleXML for the XML with DOMDocument to parse the HTML inside the description. For example:
// For the sample to be well-formed XML, the HTML in <description> is wrapped in CDATA
// and the "&" characters in <link> are escaped. A nowdoc (<<<'XML') avoids variable interpolation.
$xml_string = <<<'XML'
<item>
<title>$39.99 and Under Juniors' Swimwear</title>
<link>http://www.amazon.com/s/ref=xs_gb_rss_A1RFRNENBWTVO4/?rh=n:1036592,n:!2334084011,n:!2334146011,n:8021415011,p_6:ATVPDKIKX0DER&amp;bbn=8021415011&amp;ie=UTF8&amp;qid=1398271335&amp;rnid=15683531&amp;ccmID=380205&amp;tag=bugash-20</link>
<description><![CDATA[<table><tr><td><a rel="nofollow" target="_blank" href="http://www.amazon.com/s/ref=xs_gb_rss_A1RFRNENBWTVO4/?rh=n:1036592,n:!2334084011,n:!2334146011,n:8021415011,p_6:ATVPDKIKX0DER&bbn=8021415011&ie=UTF8&qid=1398271335&rnid=15683531&ccmID=380205&tag=rssfeeds-20"><img src="http://ecx.images-amazon.com/images/I/31kwUz5PiZL._SL160_.jpg" alt="Product Image" style='border:0;'/></a></td><td><tr><td>$39.99 and Under Juniors' Swimwear</td></tr><tr><td>Expires May 10, 2014</td></tr></tr></table>]]></description>
<guid isPermaLink="false">http://promotions.amazon.com/gp/goldbox/5159412---5opWoFoLiIfWceLGIhXzm2wwCMk=</guid>
<pubDate>Sat, 26 Apr 2014 07:00:00 +0000</pubDate>
</item>
XML;
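$xml = simplexml_load_string($xml_string); // or simplexml_load_file('path/to/file.xml');
if ($xml === false) {
    exit('Could not parse the XML.');
}
$description = (string) $xml->description; // the embedded HTML as a plain string
libxml_use_internal_errors(true); // the feed's HTML is not valid markup, so collect parser warnings silently
$dom = new DOMDocument();
$dom->loadHTML($description);
libxml_clear_errors();
// item(0) is null when the HTML contains no <img>, so check before reading the attribute
$img = $dom->getElementsByTagName('img')->item(0);
$src = $img !== null ? $img->getAttribute('src') : null;
echo $src; // http://ecx.images-amazon.com/images/I/31kwUz5PiZL._SL160_.jpg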
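For a full feed with several <item> elements, the same idea can be wrapped in a small helper and applied to each item. This is only a sketch: the function name extractImageSrc, the two-item sample feed, and the example.com URLs are made up for illustration, and it assumes each description carries its HTML as escaped text or CDATA.

```php
<?php
// Hypothetical helper: returns the src of the first <img> in an HTML fragment, or null.
function extractImageSrc(string $html): ?string
{
    if (trim($html) === '') {
        return null; // DOMDocument::loadHTML() rejects an empty string in PHP 8
    }
    $previous = libxml_use_internal_errors(true); // tolerate sloppy feed HTML
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous); // restore the caller's setting

    $img = $dom->getElementsByTagName('img')->item(0);
    return $img !== null ? $img->getAttribute('src') : null;
}

// Made-up two-item feed; the second item has no image.
$feed = <<<'XML'
<rss><channel>
<item><description><![CDATA[<a href="http://example.com/a"><img src="http://example.com/a.jpg" alt="A"/></a>]]></description></item>
<item><description><![CDATA[<p>No image here</p>]]></description></item>
</channel></rss>
XML;

$sources = [];
foreach (simplexml_load_string($feed)->channel->item as $item) {
    $sources[] = extractImageSrc((string) $item->description);
}
var_dump($sources);
```

Returning null for items without an image keeps the loop from failing on a missing node, which the one-liner with item(0)->getAttribute() would do.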
Answer 1 (score: 0)
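// String-splitting approach: no HTML parser needed, but it breaks if the markup changes.
$doc = new DOMDocument();
$doc->loadXML($xmlstring); // DOMDocument (not SimpleXML) provides getElementsByTagName()
$imgData = $doc->getElementsByTagName('description')->item(0);
$imgString = $imgData->nodeValue;
// Split on src=" rather than href=, which belongs to the surrounding <a> tag
$explodedFirst = explode('src="', $imgString);
$firstSplit = $explodedFirst[1];
// Cut at the closing quote of the src attribute
$explodedLast = explode('"', $firstSplit);
$finalURL = $explodedLast[0];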
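If you want to stay with plain string handling, a regular expression is somewhat less brittle than chained explode() calls because it accepts either quote style. A rough sketch, where $html stands in for the description text (shortened here, with a trimmed href) and the pattern is my own assumption about what the markup looks like; it is not a general HTML parser:

```php
<?php
// $html stands in for the text of <description>, shortened from the question's feed.
$html = '<a href="http://www.amazon.com/s/"><img src="http://ecx.images-amazon.com/images/I/31kwUz5PiZL._SL160_.jpg" alt="Product Image" style=\'border:0;\'/></a>';

// Match the first <img ...> tag and capture its src value, quoted with " or '.
$pattern = '/<img\b[^>]*?\bsrc\s*=\s*(["\'])(.*?)\1/i';

// preg_match() returns 1 on a match; group 2 holds the URL.
$finalURL = preg_match($pattern, $html, $m) === 1 ? $m[2] : null;
echo $finalURL, PHP_EOL;
```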