Python error: NoneType

Date: 2013-06-14 21:21:51

Tags: python

I have a Python script that uses BS4 to get the HTML of a web page. I then find a specific heading element in the HTML and extract its text. I do it like this:

r = br.open("http://example.com")
html = r.read()
r.close()
soup = BeautifulSoup(html)
# Get the contents of the html tag (h1) that displays results
searchResult = soup.find("h1").contents[0]
# Get only the number, remove all text
if not(searchResult == None):
    searchResultNum = int(re.match(r'\d+', searchResult).group())
else:
    searchResultNum = 696969

The actual HTML never changes. It always looks like this:

<div id="resultsCount">
    <h1 class="f12">606 Results matched</h1>
</div>

The problem is that my script may run for about 4 minutes (the time varies) and then crash:

Traceback (most recent call last):
  File "C:\Users\Me\Documents\Aptana Studio 3 Workspace\PythonScripts\PythonScripts\setupscript.py", line 109, in <module>
    searchResultNum = int(re.match(r'\d+', searchResult).group())
AttributeError: 'NoneType' object has no attribute 'group'

I thought I was already handling this error. I guess I just don't understand it. Can you help?

Thanks.

1 Answer:

Answer 0: (score: 1)

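If searchResult does not start with a digit, re.match(r'\d+', searchResult) returns None, because re.match only matches at the beginning of the string, and None has no group attribute. Your check only tests whether searchResult is None, not whether the match succeeded. Also, if not(searchResult == None): is a bit awkward; use if searchResult: instead: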

searchResultNum = 696969
if searchResult:
    m = re.match(r'\d+', searchResult)
    if m:
        searchResultNum = int(m.group())
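
The same check can be wrapped in a small helper so that every call site gets the fallback for free. This is only a sketch: the function name parse_result_count and the constant DEFAULT_COUNT are made up for illustration, the fallback value is the one from the question, and allowing leading whitespace before the number is an extra assumption that the original code does not make.

```python
import re

# Sentinel value taken from the question; the name is hypothetical.
DEFAULT_COUNT = 696969


def parse_result_count(text, default=DEFAULT_COUNT):
    """Return the leading integer in text, or default if there is none."""
    # Covers None and the empty string.
    if not text:
        return default
    # re.match anchors at the start of the string and returns None on failure.
    # Allowing leading whitespace is an assumption, not part of the original.
    m = re.match(r'\s*(\d+)', str(text))
    if m is None:
        return default
    return int(m.group(1))


print(parse_result_count("606 Results matched"))
print(parse_result_count("No results matched"))
print(parse_result_count(None))
```

With the code from the question, this would be called as searchResultNum = parse_result_count(searchResult). Note that soup.find("h1") itself returns None when the page has no h1 element, in which case .contents fails with a similar AttributeError before the regular expression is ever reached, so that result may be worth checking as well.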