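I've run into a strange error while doing some basic parsing. I'm collecting data in format 'x' and want to return everything in a format I can work with. My immediate problem is that my code raises an odd error. I've looked at other posts and answers here about the same error, but out of context it's hard to pin down the cause.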
data = url.text
soup = BeautifulSoup(data, "html5lib")
results = []  # this is what my result set will end up as

def parseDiv(text):
    # function takes one input parameter - a single div for which it will parse for specific items, and return it all as a dictionary
    soup2 = BeautifulSoup(text)
    title = soup2.find("a", "yschttl spt")
    print title.text
    print
    return title.text

for result in soup.find_all("div", "res"):
    """
    This is where the data is first handled - this would return a div with links, text, etc -
    So, I pass the blurb of text into the parseDiv() function
    """
    item = parseDiv(result)
    results.append(item)
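Obviously, I've already imported the libraries I need at this point. When I take out the soup2 code (the second instantiation of bs4, on the new text I want to process) and just print my function's input, everything works.
Here is the error: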
Traceback (most recent call last):
  File "testdata.py", line 29, in <module>
    item = parseDiv(result)
  File "testdata.py", line 17, in parseDiv
    soup2 = BeautifulSoup(text)
  File "C:\Python27\lib\site-packages\bs4\__i
    markup = markup.read()
TypeError: 'NoneType' object is not callable
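Answer 0 (score: 7)
You don't need to parse the div again: each result from find_all is already a parsed Tag that you can search directly. Try this: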
for div in soup.find_all('div', 'res'):
    a = div.find('a', 'yschttl spt')
    if a:
        print a.text
        print
        results.append(a)
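For anyone who wants to try the parse-once approach in isolation, here is a minimal, self-contained sketch. The sample markup, the example.com URLs, and the helper name extract_titles are made up for illustration; only the class names "res" and "yschttl spt" come from the question. It uses the built-in "html.parser" instead of "html5lib" so there is no extra dependency, and it collects the link text rather than the Tag objects, which seems to be what the question's parseDiv was returning.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page; only the class names
# ("res", "yschttl spt") are taken from the question.
SAMPLE_HTML = """
<div class="res"><a class="yschttl spt" href="http://example.com/1">First result</a></div>
<div class="res"><span>No title link here</span></div>
<div class="res"><a class="yschttl spt" href="http://example.com/2">Second result</a></div>
"""


def extract_titles(html):
    """Parse the page once and return the text of each matching title link."""
    soup = BeautifulSoup(html, "html.parser")
    titles = []
    for div in soup.find_all("div", "res"):
        # div is already a Tag, so search inside it directly
        # instead of passing it to BeautifulSoup() a second time.
        a = div.find("a", "yschttl spt")
        if a is not None:  # skip result blocks that have no title link
            titles.append(a.text)
    return titles


titles = extract_titles(SAMPLE_HTML)
print(titles)
```

As for the original TypeError, a likely explanation is that BeautifulSoup() checks whether its input has a read attribute, and attribute access on a Tag falls back to searching for a child tag of that name, so result.read evaluates to None and calling it fails. If you really do need a fresh parse of one div, passing str(result) rather than the Tag itself should avoid that.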