Question

I want to get all product names and prices from this website.

For example, I want to search for "apple": https://redmart.com/search/apple

I am using Goutte to scrape the site. So far, this is the code I have for getting the names of all items in the list:

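require 'vendor/autoload.php';

use Goutte\Client;

$client = new Client();

// Request the search results page.
$crawler = $client->request('GET', 'https://redmart.com/search/apple');

// Print the text of every link nested directly inside an <h4>.
$crawler->filter('h4 > a')->each(function ($node) {
    print $node->text()."\n";
});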

But when I run the code, it prints nothing. How can I get the names and prices of all the items in the list?

Answer 1

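redmart.com generates its content with React. Goutte does not execute JavaScript, so the HTML it downloads does not contain the product list yet, and a scraper like Goutte will not work here. Instead, open the developer console in Firefox or Google Chrome and watch what the page actually requests.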

In this case, the page makes an AJAX request to the following URL, which returns JSON that React then renders: https://api.redmart.com/v1.6.0/catalog/search?q=apple&pageSize=18&sort=1024&variation=BETA

With PHP, you only need to call json_decode on the response to get everything you need.
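As a rough illustration of the json_decode step, here is a minimal sketch. The helper name extractProducts, the sample payload, and the keys "products", "title" and "pricing.price" are all assumptions made for the example; inspect the real response in the developer console and adjust the keys to match.

```php
<?php
// Hypothetical sample payload. The real API response may use
// different field names; check it in the browser's network tab.
$sampleJson = '{"products":['
    . '{"title":"Red Apple","pricing":{"price":1.5}},'
    . '{"title":"Green Apple","pricing":{"price":2.25}}'
    . ']}';

/**
 * Decode a search response and return a list of name/price pairs.
 * The keys "products", "title" and "pricing.price" are assumed, not confirmed.
 */
function extractProducts(string $json): array
{
    // Decode into associative arrays rather than objects.
    $data = json_decode($json, true);

    // Invalid JSON or an unexpected shape yields an empty list.
    if (!is_array($data) || !isset($data['products']) || !is_array($data['products'])) {
        return [];
    }

    $items = [];
    foreach ($data['products'] as $product) {
        $items[] = [
            'name'  => $product['title'] ?? null,
            'price' => $product['pricing']['price'] ?? null,
        ];
    }

    return $items;
}

// Print one "name: price" line per product.
foreach (extractProducts($sampleJson) as $item) {
    echo $item['name'], ': ', $item['price'], "\n";
}
```

To run this against the live endpoint, the sample string could be replaced with the body of the API response, for instance fetched with file_get_contents, provided allow_url_fopen and HTTPS support are enabled in your PHP setup.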

Answer 2

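There is no need to scrape the web page. You can rely on the site's API and use its JSON output. For example, this is the API URL for the apple listing: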

https://api.redmart.com/v1.6.0/catalog/search?q=apple&pageSize=18&sort=1024&page=1&variation=BETA
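If you need more than one search term or page, the URL above can be assembled from its parts. The following is only a sketch: buildSearchUrl is a hypothetical helper, and the parameter values are copied from the URL above without knowing exactly what sort and variation mean to the API.

```php
<?php
/**
 * Build a search URL shaped like the one above.
 * Hypothetical helper; the defaults mirror the example URL only.
 */
function buildSearchUrl(string $query, int $page = 1, int $pageSize = 18): string
{
    $params = [
        'q'         => $query,
        'pageSize'  => $pageSize,
        'sort'      => 1024,   // value taken from the example URL
        'page'      => $page,
        'variation' => 'BETA', // value taken from the example URL
    ];

    // http_build_query URL-encodes each value and joins the pairs with "&".
    return 'https://api.redmart.com/v1.6.0/catalog/search?'
        . http_build_query($params, '', '&', PHP_QUERY_RFC3986);
}

echo buildSearchUrl('apple'), "\n";
echo buildSearchUrl('green apple', 2), "\n";
```

Building the query string with http_build_query rather than by hand means search terms containing spaces or other special characters are encoded correctly.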

Web Scraping with PHP Goutte
