I received this error:
NameError: name 'htmltext' is not defined
It comes from the following code:
from bs4 import BeautifulSoup
import urllib
import urllib.parse

url = "http://nytimes.com"
urls = [url]
visited = [url]

while len(urls) > 0:
    try:
        htmltext = urllib.urlopen(urls[0]).read()
    except:
        print(urls[0])
    soup = BeautifulSoup(htmltext)
    urls.pop(0)
    print(soup.findAll('a',href = true))
Answer 0 (score: 1)
In Python 3.x, you must import urllib.request instead of urllib. Then change the line:
htmltext = urllib.urlopen(urls[0]).read()
to:
htmltext = urllib.request.urlopen(urls[0]).read()
Finally, change true to True.
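It may also help to see why the message was a NameError and not something about urllib. In Python 3, urllib.urlopen raises an AttributeError, the bare except swallows it, and the loop carries on to use htmltext, which was never assigned. Below is a minimal sketch of one way to restructure the fetch step so a failed request is reported and skipped. The names fetch_html and crawl and the opener parameter are hypothetical, introduced only for illustration (opener lets the code be exercised without network access); this is not the asker's full crawler.

```python
import urllib.request


def fetch_html(url, opener=urllib.request.urlopen):
    """Return the page body as bytes, or None if the request fails.

    `opener` is a hypothetical injection point for testing; by default
    it is urllib.request.urlopen.
    """
    try:
        with opener(url) as response:
            return response.read()
    except (OSError, ValueError) as exc:
        # urllib.error.URLError and HTTPError are OSError subclasses;
        # ValueError covers malformed URLs. Print the real cause
        # instead of hiding it behind a bare except.
        print(f"failed to fetch {url}: {exc}")
        return None


def crawl(start_url, opener=urllib.request.urlopen):
    """Fetch each queued URL and return {url: body} for the successes."""
    urls = [start_url]
    pages = {}
    while urls:
        current = urls.pop(0)
        htmltext = fetch_html(current, opener)
        if htmltext is None:
            continue  # skip this URL, so htmltext is never used undefined
        pages[current] = htmltext
    return pages
```

The bytes returned for each page can then be passed to BeautifulSoup as in the question, for example soup.find_all('a', href=True). Catching only OSError and ValueError is a design choice in this sketch: programming mistakes such as a misspelled function name still surface with their own traceback.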