import bs4, requests, re

# Get the episode webpage
epPage = requests.get('http://www.friends-tv.org/zz101.html')
epPage.raise_for_status()

# Parse the page with bs4
soup = bs4.BeautifulSoup(epPage.text, 'lxml')
results = soup.find_all('dt')

# Populate the list
quotes = []
for result in results:
    character = result.find('b').text
    speech = result.contents[1][1:-2]
    quotes.append((character, speech))

print(quotes)
I am trying to build a list of quotes, together with the character who says each one, from this site: http://www.friends-tv.org/zz101.html. However, I get this error:
Traceback (most recent call last):
  File "/Users/yusufsohoye/pythoncode/Friends1.py", line 16, in <module>
    character = result.find('b').text
AttributeError: 'NoneType' object has no attribute 'text'
It works when I isolate an individual dt item from the results list, but it fails when I try to parse the whole page and build the list.
Thanks
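Answer 0 (score: 1)
This should help. The error means that at least one dt element has no b tag inside it, so result.find('b') returns None. Check for that before reading .text.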
import bs4, requests, re

# Get the episode webpage
epPage = requests.get('http://www.friends-tv.org/zz101.html')
epPage.raise_for_status()

# Parse the page with bs4
soup = bs4.BeautifulSoup(epPage.text, 'lxml')
results = soup.find_all('dt')

# Populate the list
quotes = []
for result in results:
    character = result.find('b')
    if character:  # Skip dt tags that do not contain a b tag
        speech = result.contents[1][1:-2]
        quotes.append((character.text, speech))

print(quotes)
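To see the None check in isolation without hitting the network, here is a minimal sketch that runs on an inline HTML string. The sample markup, the helper name extract_quotes, and the choice of the built-in 'html.parser' are assumptions made for illustration. The real page may be structured differently.

```python
import bs4

# Hypothetical sample markup (not copied from the real page). The first <dt>
# has no <b> child, which is the case where find('b') returns None.
SAMPLE_HTML = """
<dl>
<dt>[Scene: a coffee shop]</dt>
<dt><b>Monica:</b> Hello there.</dt>
<dt><b>Joey:</b> How are you?</dt>
</dl>
"""


def extract_quotes(html):
    """Return (character, speech) tuples, skipping <dt> tags without a <b>."""
    soup = bs4.BeautifulSoup(html, 'html.parser')
    quotes = []
    for dt in soup.find_all('dt'):
        speaker = dt.find('b')
        if speaker is None:
            # No speaker tag (e.g. a scene description), so skip this entry
            continue
        # Speaker name without the trailing colon
        character = speaker.get_text(strip=True).rstrip(':')
        # Join the plain-text nodes that follow the <b> tag
        speech = ''.join(
            s for s in speaker.next_siblings if isinstance(s, str)
        ).strip()
        quotes.append((character, speech))
    return quotes


quotes = extract_quotes(SAMPLE_HTML)
print(quotes)
```

Reading the text from the siblings of the b tag, rather than slicing result.contents[1][1:-2], avoids depending on the exact whitespace around each line. Whether that suits the real page is something to check against its actual markup.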