Question

I'm scraping this URL with Scrapy: http://quotes.toscrape.com/

This works fine:

response.xpath("//meta[@itemprop='keywords']/@content").extract()
response.xpath("//meta[@itemprop='keywords'][1]/@content").extract_first()

But when I try to use an index to get the second meta from the list of metas:

response.xpath("//meta[@itemprop='keywords'][2]/@content").extract_first()

it doesn't work.

What am I missing?

Thanks!

Answer 1

You need to wrap the expression in parentheses before applying the index.

Instead of:

"//meta[@itemprop='keywords'][2]/@content"

it should be:

"(//meta[@itemprop='keywords'])[2]/@content"

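This is necessary because of the // in your XPath. The [2] predicate binds more tightly than //, so without parentheses it selects any meta that is the second matching meta under its own parent, and on this page each quote has only one. The parentheses make [2] index the combined list of all matches instead.

You can test it: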

$ scrapy shell "http://quotes.toscrape.com/"
In [1]: response.xpath("//meta[@itemprop='keywords'][2]/@content").extract_first()

In [2]: response.xpath("(//meta[@itemprop='keywords'])[2]/@content").extract_first()
Out[2]: 'abilities,choices'
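
To reproduce the difference without hitting the network, here is a minimal sketch using lxml directly (the library Scrapy's selectors are built on). It assumes lxml is installed; the PAGE markup is a made-up stand-in shaped like the real page, with one keywords meta per quote wrapper, and is not a copy of the site's HTML.

```python
from lxml import html

# Hypothetical markup: each quote has its own wrapper element that
# contains exactly one keywords <meta>.
PAGE = """
<html><body>
  <div class="quote"><meta itemprop="keywords" content="change,deep-thoughts"></div>
  <div class="quote"><meta itemprop="keywords" content="abilities,choices"></div>
</body></html>
"""

tree = html.fromstring(PAGE)

# Without parentheses, [2] is evaluated per parent element: no <div>
# holds a second keywords <meta>, so nothing matches.
per_parent = tree.xpath("//meta[@itemprop='keywords'][2]/@content")

# With parentheses, all matches are collected first and [2] then picks
# the second item of that combined list.
grouped = tree.xpath("(//meta[@itemprop='keywords'])[2]/@content")

print(per_parent)  # []
print(grouped)     # ['abilities,choices']
```

The same two expressions behave identically when passed to response.xpath() in the Scrapy shell, as in the session above.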
