Question

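I'm trying to parse a weather page and select the highs from the weekly forecast.

Normally I would search with tags = soup.find_all("span", id="hi"), but this tag doesn't use an id; it uses a class.

Full code: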

import mechanize
from bs4 import BeautifulSoup

my_browser = mechanize.Browser()
html_page = my_browser.open("http://www.wunderground.com/weather-forecast/45056")
html_text = html_page.get_data()
my_soup = BeautifulSoup(html_text)

tags = my_soup.find_all("span", class_="hi")

temp = tags[0].string
print(temp)

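When I run this, nothing is printed.

The relevant piece of HTML is buried inside a bunch of other tags, but the tag for today's high looks like this: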

<span class="hi">63</span>

Answer 1

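Just use class_ as the parameter name. See the Beautiful Soup docs.

The problem arises because class is a reserved keyword in Python, so you can't use it directly as an argument name.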
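
To illustrate the point about class_, here is a minimal sketch that runs against an inline HTML string instead of the live page. Only the span with class "hi" comes from the question; the surrounding div, the "lo" span, and the names SAMPLE_HTML, by_keyword, by_attrs, by_css, and highs are assumptions made up for this example. It assumes Beautiful Soup 4 is installed and uses the standard-library "html.parser" backend.

```python
import keyword

from bs4 import BeautifulSoup

# Hypothetical sample markup: only the "hi" span is taken from the question.
SAMPLE_HTML = '<div><span class="hi">63</span><span class="lo">41</span></div>'

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# `class` is a reserved word, which is why it cannot be used as a keyword argument.
print(keyword.iskeyword("class"))

# Three equivalent ways to match on the class attribute.
by_keyword = soup.find_all("span", class_="hi")           # trailing underscore
by_attrs = soup.find_all("span", attrs={"class": "hi"})   # explicit attrs dict
by_css = soup.select("span.hi")                           # CSS selector

# Convert the matched tags' text to plain strings.
highs = [str(tag.string) for tag in by_keyword]

# Check for an empty result before indexing, so a failed match is visible.
if highs:
    print(highs[0])
else:
    print("no <span class='hi'> tags found")
```

If the same search returns an empty list on the real page, the markup received by the script probably differs from what the browser shows, so printing len(tags) before indexing is a cheap first check.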

Answer 2

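As an alternative to scraping the web page, you could look at Weather Underground's API. It is free for developers (with a limited number of calls per day, among other restrictions), and if you are going to make a lot of queries it may turn out to be easier in the end.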

Python: parsing a class prints nothing?
