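PHP XPath problem with Amazon products

Time: 2012-07-16 12:47:38

Tags: php regex xpath

I want to get all the images from the following URL. I am using the XPath query below, but it returns null every time.

URL: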

http://www.amazon.com/gp/browse.html?ie=UTF8&marketplaceID=ATVPDKIKX0DER&me=A219HML0CVO0HP  

XPath query:

$products = $xpath->evaluate('//div[@class="productTitle"]//img');  

2 answers:

Answer 0: (score: 1)

I believe you have one forward slash too many before img:

$xpath->evaluate('//div[@class="productTitle"]/img');

This should match the following HTML, which is present at that link:

<div id="srProductTitle_B0000CBIFG_0" class="productTitle">
    <a href="https://rads.stackoverflow.com/amzn/click/com/B0000CBIFG" rel="nofollow noreferrer">
    <img src="http://ecx.images-amazon.com/images/I/51BZs4Gf5pL._SL160_AA160_.jpg" class="" border="0" alt="Product Details"  width="160" height="160"/><br clear="all" />Weed Eater 952701594 0.065-Inch-by-200-Foot Bulk Round String Trimmer Line
    </a>
</div>
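
A query like this can be checked offline before running it against the live page. The following is a minimal sketch using PHP's DOMDocument and DOMXPath on the snippet above; the variable names and the shortened link text are assumptions for illustration, not part of the original post. In the snippet as pasted, the img sits inside the a element, so a child step (/img) selects it only when the path goes through a, while the descendant step (//img) reaches it at any depth.

```php
<?php
// Sample markup copied from the snippet above (link text shortened for brevity).
$html = <<<HTML
<div id="srProductTitle_B0000CBIFG_0" class="productTitle">
    <a href="https://rads.stackoverflow.com/amzn/click/com/B0000CBIFG" rel="nofollow noreferrer">
    <img src="http://ecx.images-amazon.com/images/I/51BZs4Gf5pL._SL160_AA160_.jpg" class="" border="0" alt="Product Details" width="160" height="160"/><br clear="all" />Weed Eater
    </a>
</div>
HTML;

$doc = new DOMDocument();
// Real-world pages are rarely valid HTML; collect libxml warnings instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Descendant step: img anywhere below the div.
$descendants = $xpath->query('//div[@class="productTitle"]//img');
// Child steps: img directly inside an a that is directly inside the div.
$viaAnchor = $xpath->query('//div[@class="productTitle"]/a/img');
// Single child step: img that is a direct child of the div itself.
$directChildren = $xpath->query('//div[@class="productTitle"]/img');

foreach ($descendants as $img) {
    echo $img->getAttribute('src'), PHP_EOL;
}
```

On this snippet the first two queries each return one node and the third returns none, so comparing the three node counts is a quick way to see which axis fits the markup actually being parsed.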

Answer 1: (score: 0)

This may help you...

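// Fetch the page (requires allow_url_fopen to be enabled).
$subject = file_get_contents('http://www.amazon.com/gp/browse.html?ie=UTF8&marketplaceID=ATVPDKIKX0DER&me=A219HML0CVO0HP');
// Remove runs of whitespace so each product entry sits on a single line.
$string = preg_replace('/\s\s+/', '', $subject);

// Match against the collapsed $string rather than the raw $subject; group 5 captures the img src.
preg_match_all('/<a(.*?)href="(.*?)">(.*?)<img(.*?)src="(.*?)"(.*?)class=""(.*?)border="0"(.*?)alt="Product(.*?)Details/', $string, $result, PREG_PATTERN_ORDER);

echo "<pre>";
for ($i = 0; $i < count($result[0]); $i++) {
    // Print one image URL per line.
    echo $result[5][$i], "\n";
}
echo "</pre>";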
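
The pattern can also be tried without a network request. This is a rough sketch that assumes the page markup looks like the product entry quoted in the first answer (the sample string and the $raw, $collapsed, and $pattern names are illustrative assumptions; the live page layout may differ). It runs the same pattern on the raw markup and on the whitespace-collapsed copy to show why the collapsing step matters.

```php
<?php
// Sample markup modeled on one product entry; the live page layout may differ.
$subject = <<<HTML
<div id="srProductTitle_B0000CBIFG_0" class="productTitle">
    <a href="https://rads.stackoverflow.com/amzn/click/com/B0000CBIFG" rel="nofollow noreferrer">
    <img src="http://ecx.images-amazon.com/images/I/51BZs4Gf5pL._SL160_AA160_.jpg" class="" border="0" alt="Product Details"  width="160" height="160"/>
    </a>
</div>
HTML;

$pattern = '/<a(.*?)href="(.*?)">(.*?)<img(.*?)src="(.*?)"(.*?)class=""(.*?)border="0"(.*?)alt="Product(.*?)Details/';

// Without the s modifier, "." does not match newlines, so the raw markup
// (where <a> and <img> are on separate lines) is not matched.
preg_match_all($pattern, $subject, $raw, PREG_PATTERN_ORDER);

// Removing whitespace runs puts the tags on one line before matching.
$string = preg_replace('/\s\s+/', '', $subject);
preg_match_all($pattern, $string, $collapsed, PREG_PATTERN_ORDER);

echo count($raw[0]), ' match(es) on the raw markup', PHP_EOL;
echo count($collapsed[0]), ' match(es) after collapsing whitespace', PHP_EOL;
foreach ($collapsed[5] as $src) {
    echo $src, PHP_EOL; // group 5 holds the src attribute
}
```

Regular expressions of this kind are tied to the exact attribute order in the markup, so a DOM parser is usually the more robust choice when the page structure can change.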

Thanks..... P2C