Unable to scrape h3 tags with BeautifulSoup

Time: 2013-12-11 10:41:11

Tags: python-2.7 beautifulsoup

The URL in question is: http://www.amazon.com/s/field-keywords=machine%20learning

I want to extract the links for all matching results on the first page. I believe the corresponding HTML is:

<h3 class="newaps">
    <a href="https://rads.stackoverflow.com/amzn/click/com/1600490069" rel="nofollow noreferrer">
    <span class="lrg bold"> … </span>
    </a>
    <span class="med reg"> … </span>
</h3>

Here is what I have tried so far, without success:

In [39]: url = "http://www.amazon.com/s/field-keywords=machine%20learning"

In [40]: page = urllib2.urlopen(url)

In [42]: html = page.read()

In [44]: soup = BeautifulSoup(html)

In [45]: soup.findAll('h3',{'class': 'newaps'})
Out[45]: []

I'm not sure what I'm doing wrong here.
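
One way to narrow this down is to check, independently of BeautifulSoup, whether the markup above is present in a given HTML string at all. The following is only a minimal sketch using the standard library's HTMLParser. The names H3LinkCollector and links_in_h3 are hypothetical, and the sample string is the snippet from this question with the elided parts left elided. If the same function also returns an empty list for the downloaded html, one possible explanation (an assumption, not something confirmed here) is that the page served to the script differs from what the browser shows.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2.7


class H3LinkCollector(HTMLParser):
    """Hypothetical helper: collect href values of <a> tags inside <h3> tags
    that carry a given CSS class."""

    def __init__(self, css_class):
        HTMLParser.__init__(self)
        self.css_class = css_class
        self.in_h3 = False  # True while inside a matching <h3>
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "h3":
            # The class attribute may hold several space-separated classes.
            classes = (attrs.get("class") or "").split()
            self.in_h3 = self.css_class in classes
        elif tag == "a" and self.in_h3 and attrs.get("href"):
            self.links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "h3":
            self.in_h3 = False


def links_in_h3(html, css_class="newaps"):
    """Return all link targets found inside <h3 class=css_class> blocks."""
    parser = H3LinkCollector(css_class)
    parser.feed(html)
    parser.close()
    return parser.links


# The snippet from the question; the "..." parts are elided in the original.
sample = """<h3 class="newaps">
    <a href="https://rads.stackoverflow.com/amzn/click/com/1600490069" rel="nofollow noreferrer">
    <span class="lrg bold"> ... </span>
    </a>
    <span class="med reg"> ... </span>
</h3>"""

print(links_in_h3(sample))
```

Calling links_in_h3(html) on the string read from the page in the session above would show whether any h3 with class newaps is present in the response the script actually received.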

0 answers:

No answers