Scrapy Splash response does not return the complete HTML

Time: 2019-04-01 02:28:55

Tags: javascript python web-scraping scrapy-splash

I am trying to scrape this link: https://www.myntra.com/women-kurtas-kurtis-suits. But when I render it through the Splash HTTP API, I only get a partially rendered result:

[screenshot: partially rendered result]

Am I missing something here?

This is the actual page:

[screenshot: actual page]

1 Answer:

Answer 0 (score: 2)

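If you are scraping products, why not use the plain, non-JS-rendered HTML that is returned by default? It contains JSON objects (JSON-LD `<script>` blocks) with the product details. Here is an example from the page you posted: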

    <script type="application/ld+json"> {"@context":"https://schema.org","@type":"Product","name":"AKS Women Blue & Grey Printed Kurta with Palazzos","image":"http://assets.myntassets.com/assets/images/8076903/2018/12/8/fb0cf882-a473-4aae-86c2-edf912b70b6e1544251004970-AKS-Women-Kurta-Sets-2261544251003921-1.jpg","description":"Women Printed Kurta with Palazzos","brand":{"@type":"Thing"},"offers":{"@type":"Offer","priceCurrency":"INR","price":989},"AggregateRating":{"@type":"AggregateRating","itemReviewed":"AKS Women Blue & Grey Printed Kurta with Palazzos","ratingCount":0,"reviewCount":""}}</script>

With Python's json library, you can extract the data and use it however you need.
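As a rough sketch of that approach, the snippet below pulls every `application/ld+json` script block out of raw HTML and parses it with `json.loads`. The helper name `extract_products` and the regular expression are illustrative assumptions, not part of Scrapy or Splash, and the sample HTML is a shortened version of the block shown above. It also assumes each script block holds a single JSON object, as in that example.

```python
import json
import re

# Matches the body of every <script type="application/ld+json"> element.
# (Hypothetical helper pattern; a real HTML parser is more robust.)
LD_JSON_RE = re.compile(
    r'<script[^>]*type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
    re.DOTALL | re.IGNORECASE,
)


def extract_products(html):
    """Return the parsed JSON-LD objects whose @type is "Product"."""
    products = []
    for raw in LD_JSON_RE.findall(html):
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            # Skip script blocks that are not valid JSON.
            continue
        if isinstance(data, dict) and data.get("@type") == "Product":
            products.append(data)
    return products


if __name__ == "__main__":
    # Shortened sample based on the script block quoted in the answer.
    sample_html = (
        '<script type="application/ld+json"> {"@context":"https://schema.org",'
        '"@type":"Product","name":"AKS Women Blue & Grey Printed Kurta with Palazzos",'
        '"offers":{"@type":"Offer","priceCurrency":"INR","price":989}}</script>'
    )
    for product in extract_products(sample_html):
        print(product["name"], product["offers"]["price"])
```

Inside a Scrapy callback, the same script bodies can probably be selected with `response.xpath('//script[@type="application/ld+json"]/text()').getall()` instead of a regular expression, and each string passed to `json.loads` in the same way.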