NameError: name 'htmltext' is not defined

Time: 2014-10-26 18:51:03

Tags: python python-3.x

I get an error when running this script:

import urllib.request
import urllib.parse
from bs4 import BeautifulSoup

url = "http://nytimes.com,http://nytimes.com"

urls = [url] #stack of urls to scrape
visited = [url] #historic record of urls

while len(urls) > 0:
    try:
        htmltext = urllib.request.urlopen(urls[0]).read()
    except:
        print(htmltext)

Original script:

import urllib.request
import urllib.parse
from bs4 import BeautifulSoup

url = "http://nytimes.com,http://nytimes.com"

urls = [url] #stack of urls to scrape
visited = [url] #historic record of urls

while len(urls) > 0:
    try:
        htmltext = urllib.request.urlopen(urls[0]).read()
    except:
        print(urls[0])
    soup = BeautifulSoup(htmltext)

    urls.pop(0)

    print(soup.findAll('a', href=True))

Errors:

socket.gaierror: [Errno -2] Name or service not known

urllib.error.URLError: <urlopen error [Errno -2] Name or service not known>

Traceback (most recent call last):

NameError: name 'htmltext' is not defined

1 answer:

Answer 0 (score: 2)

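If urllib.request.urlopen() raises an exception, the assignment never completes and htmltext is never bound to a value, so printing it inside the except block raises a NameError of its own. The same applies to any later line that uses htmltext, such as the BeautifulSoup(htmltext) call in the original script.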
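One way to avoid touching an unassigned name is to return from a small function, so the failure path never refers to htmltext at all. This is only a minimal sketch: `fetch` and its `opener` parameter are hypothetical names introduced for illustration, not part of the original script, and the choice to return None on failure is an assumption.

```python
import urllib.error
import urllib.request


def fetch(url, opener=urllib.request.urlopen):
    """Return the response body as bytes, or None if the request fails.

    `fetch` and `opener` are hypothetical helpers for this sketch;
    `opener` defaults to urllib.request.urlopen.
    """
    try:
        with opener(url) as response:
            return response.read()
    except (urllib.error.URLError, ValueError) as exc:
        # Nothing was read, so report the URL and the reason instead of
        # printing a variable that was never assigned.
        print("failed to fetch", url, "->", exc)
        return None
```

In the loop, the caller would then check the result (for example `if htmltext is None: continue` after popping the URL) before handing it to BeautifulSoup.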

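As for why urlopen() fails, make sure you pass it a valid URL. The string assigned to url is two addresses joined by a comma, not a single URL, so the host name cannot be resolved, which matches the socket.gaierror shown above.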
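If the comma in the url string was meant to separate two addresses, it could be split before anything is passed to urlopen(). The sketch below assumes exactly that; `split_urls` is a hypothetical helper, and accepting only http and https schemes is an illustrative choice rather than a requirement of urllib.

```python
from urllib.parse import urlsplit

raw = "http://nytimes.com,http://nytimes.com"  # the value from the question


def split_urls(raw):
    """Split a comma-separated string into de-duplicated http(s) URLs.

    Hypothetical helper: treating the comma as a separator is an assumption.
    """
    urls = []
    for candidate in raw.split(","):
        candidate = candidate.strip()
        parts = urlsplit(candidate)
        # Keep only entries with an http(s) scheme and a host part.
        if parts.scheme in ("http", "https") and parts.netloc and candidate not in urls:
            urls.append(candidate)
    return urls


urls = split_urls(raw)
print(urls)  # ['http://nytimes.com']
```

The resulting list can seed both urls and visited in place of the single comma-joined string.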