Question

I really need this community's help.

My problem is that when I use the code

response.xpath("//div[contains(@class,'check-prices-widget-not-sponsored')]/a/div[contains(@class,'check-prices-widget-not-sponsored-link')]").extract()

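in the scrapy shell to extract the vendor names, the output is empty. I really don't know why this happens. Could the problem be that the site's information is updated dynamically?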

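The URL of the page I am scraping is https://cruiseline.com/cruise/7-night-bahamas-florida-new-york-roundtrip-32860, and what I need is the vendor name and price for each vendor. The attached picture is a screenshot of the browser's "inspect" view.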

Any help is much appreciated!

Answer 1

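You should always check the raw HTML source in your browser (usually with Ctrl + U). The "inspect" view shows the DOM after JavaScript has run, whereas Scrapy only sees the HTML the server returns.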

That way you can find the information you need embedded as JSON in JavaScript variables:

var partnerPrices = [{"pool":"9a316391b6550eef969c8559c14a380f","partner":"ncl.com","priority":0,"currency":"USD","data":{"32860":{"2018-02-25":{"Inside":579,"Suite":1199,"Balcony":699,"Oceanview":629},....
var sponsored_partners = [{"code":"CDCNA","name":"cruises.com","value":"cruises.com","logo":"\/images\/partner-logo-cruises-sm.png","logo_sprite":"partner-logo-cruises-com"},...

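So you need to import json, pull the two JSON strings out of response.body (using re or some other method), and then pass each of them to json.loads() so you can iterate over both arrays.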
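A minimal sketch of that approach, assuming the page source contains the two variables in the `var name = [...]` form shown above. The helper name `extract_js_array` and the `sample_body` string are made up for illustration; the sample is a shortened stand-in, not the real page. Instead of capturing the whole array with a regex and calling json.loads(), it uses `re` only to find where the value starts and lets `json.JSONDecoder.raw_decode` parse a single JSON value from there, which avoids having to match the closing bracket by hand.

```python
import json
import re


def extract_js_array(body, var_name):
    """Return the JSON value assigned to `var <var_name> = ...` in the page source."""
    match = re.search(r"var\s+" + re.escape(var_name) + r"\s*=\s*", body)
    if match is None:
        raise ValueError("variable %r not found in page source" % var_name)
    # raw_decode parses exactly one JSON value starting at the given index and
    # ignores whatever follows it (the trailing ';' and the rest of the script).
    value, _end = json.JSONDecoder().raw_decode(body, match.end())
    return value


# Hypothetical, abbreviated stand-in for the page source (response.text in Scrapy).
sample_body = """
<script>
var partnerPrices = [{"partner":"ncl.com","currency":"USD","data":{"32860":{"2018-02-25":{"Inside":579,"Balcony":699}}}}];
var sponsored_partners = [{"code":"CDCNA","name":"cruises.com","logo":"\\/images\\/partner-logo-cruises-sm.png"}];
</script>
"""

partner_prices = extract_js_array(sample_body, "partnerPrices")
sponsored_partners = extract_js_array(sample_body, "sponsored_partners")

# Walk the nested structure: partner -> cruise id -> sailing date -> cabin prices.
for entry in partner_prices:
    for cruise_id, sailings in entry["data"].items():
        for sailing_date, cabins in sailings.items():
            print(entry["partner"], cruise_id, sailing_date, cabins)

for partner in sponsored_partners:
    print(partner["code"], partner["name"])
```

Inside a spider callback or the scrapy shell you would pass response.text in place of sample_body. How the two arrays relate to each other (for example, how a partner in one maps to a name in the other) has to be checked against the real data.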

Web Scraping using XPath in Scrapy returns empty value
