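I am using the .get_data() method with mechanize, and it appears to print the HTML I want. I also checked the type of what it returns, and the type is 'str'.
But when I try to parse that str with BeautifulSoup, I get the following error: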
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-163-11c061bf6c04> in <module>()
7 html = get_html(first[i],last[i])
8 print type(html)
----> 9 print parse_page(html)
10 # l_to_store.append(parse_page(html))
11 # hfb_data['l_to_store']=l_to_store
<ipython-input-161-bedc1ba19b10> in parse_hfb_page(html)
3 parse html to extract info in connection with a particular person
4 '''
----> 5 soup = BeautifulSoup(html)
6 for el in soup.find_all('li'):
7 if el.find('span').contents[0]=='Item:':
TypeError: 'module' object is not callable
What exactly is the 'module' here, and how do I get at the HTML content that get_data() returns?
Answer 0 (score: 4)
When you import BeautifulSoup like this:
import BeautifulSoup
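you are importing a module, which contains classes, functions, and so on. The name BeautifulSoup then refers to the module itself, so BeautifulSoup(html) tries to call the module, which is what raises the TypeError. To create an instance of the BeautifulSoup class that lives inside the BeautifulSoup module, you need to either import the class itself or use its full name with the module prefix, as yonili suggested in the comment above: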
from BeautifulSoup import BeautifulSoup
soup = BeautifulSoup(html)
or
import BeautifulSoup
soup = BeautifulSoup.BeautifulSoup(html)
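
The same mistake can be reproduced without BeautifulSoup installed. The sketch below is only an illustration: it uses the standard-library datetime module as a stand-in, on the assumption that it is a fair analogy because it also contains a class with the same name as the module. The variable names are made up for the example.

```python
import types
import datetime  # binds the name "datetime" to the module, not the class

# Stand-in for BeautifulSoup: a module that holds a class of the same name.
is_module = isinstance(datetime, types.ModuleType)

# Calling the module itself fails, just like BeautifulSoup(html) did.
error_message = None
try:
    datetime(2020, 1, 1)
except TypeError as exc:
    error_message = str(exc)

# Fix 1: use the full name with the module prefix.
via_prefix = datetime.datetime(2020, 1, 1)

# Fix 2: import the class itself (aliased here to avoid shadowing the module).
from datetime import datetime as DateTimeClass
via_import = DateTimeClass(2020, 1, 1)

print(is_module)
print(error_message)
print(via_prefix == via_import)
```

If type(BeautifulSoup) reports a module rather than a class, the import statement is the thing to change, not the string returned by get_data().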