XPath没有找到任何结果

时间:2017-06-23 23:58:03

标签: python xpath python-requests lxml

使用Python 3.4,lxml和请求来刮取谷歌趋势。

在此示例中,我正在尝试检索位于这些span标记之间的文本“Johnny Depp”。我是lxml模块和XPath语法的新手,但我不确定此时我做错了什么。

提前谢谢。

HTML:

<span class="hottrends-single-trend-title ellipsis-maker-inner">Johnny Depp</span>

代码:

from lxml import html
import requests

page = requests.get('https://trends.google.com/trends/hottrends')
tree = html.fromstring(page.content)

#This will create a list of trends:
trends = tree.xpath('//span[@class="hottrends-single-trend-title ellipsis-maker-inner"]/text()')

print('Trends: ', trends)

结果: enter image description here

1 个答案:

答案 0 :(得分:1)

使用相应的RSS URL,您可以使用item的XML解析器,甚至可以使用标准库中的title,因为XML结构比HTML版本简单得多。给定RSS XML,您可以遍历>>> from lxml import etree as ET >>> import requests >>> page = requests.get('https://trends.google.com/trends/hottrends/atom/feed?pn=p1') >>> root = ET.fromstring(page.content) >>> for trend in root.xpath('//item'): ... print trend.find('title').text ... spinner Old Navy Flip Flop Sale You Get Me Johnny Depp NHL Draft GLOW Despicable Me 3 Blake Griffin Robert Del Naja DJ Khaled Grateful Bella Thorne Tubelight interstellar Camila Cabello Mexico vs Russia Frank Mason Bam Adebayo TJ Leaf the house Dwyane Wade 元素并打印pyplot.show(),例如(虽然最高结果不再是Johnny Depp&#39;现在:)): / p>

pyplot.show(some_image_as_argument)