我正在尝试为使用urllib3下载网页并将其存储在字典中的作业编写程序。 (我使用spyder 3.6) 该程序给了我一个属性错误'而且我不知道我做错了什么。这是我为代号写的一步一步的笔记代码。
#Downloading a webpage
import urllib3
import sys
#these import statements allow us to use 'modules' aka 'libraries' ....
#code written by others that we can use
urlToRead = 'http://www.google.com'
#This value won't actually get used, because of the way the while loop
#below is set up. But while loops often need a dummy value like this to
#work right the first time
crawledWebLinks = {}
#Initialize an empty dictionary, in which (key, value) pairs will correspond to (short, url) eg
#("Goolge" , "http://www.google.com")
#Ok, there is a while loop coming up
#Here ends the set up
while urlToRead != ' ':
#This is a condition that dictates that the while loop will keep checking
#as long as this condition is true the loop will continue, if false it will stop
try:
urlToRead = input("Please enter the next URL to crawl")
#the "try" prevents the program from crashing if there is an error
#if there is an error the program will be sent to the except block
if urlToRead == '':
print ("OK, exiting loop")
break
#if the user leaves the input blank it will break out of the loop
shortName = input("Please enter a short name for the URL " + urlToRead)
webFile = urllib3.urlopen(urlToRead).read()
#This line above uses a ready a readymade function in the urllib3 module to
#do something super - cool:
#IT takes a url, goes to the website for the url, downloads the
#contents (which are in the form of HTML) and returns them to be
#stored in a string variable (here called webFile)
crawledWebLinks[shortName] = webFile
#this line above place a key value pair (shortname, HTML for that url)
#in the dictionary
except:
#this bit of code - the indented lines following 'except:' will be
#excecuted if the code in the try block (the indented following lines
#the 'try:' above) throw and error
#this is an example of something known as exeption-handling
print ("*************\nUnexpected Error*****", sys.exc_info()[0])
#The snip 'sys.exc_info()[0]' return information about the last
#error that occurred -
#this code is made available through the sys library that we imported above
#Quite Magical :)
stopOrProceed = input("Hmm..stop or proceed? Enter 1 to stop, enter anything else to continue")
if stopOrProceed ==1 :
print ('OK...Stopping\n')
break
#this break will break out of the nearest loop - in this case,
#the while loop
else:
print ("Cool! Let's continue\n")
continue
# this continue will skip out of the current iteration of this
#loop and move to the next i.e. the loop will reset to the start
print (crawledWebLinks.keys())
答案 0 :(得分:1)
您的问题是,您尝试拨打urllib3.urlopen()
,urllib3
没有会员urlopen
这是一个有效的代码段。我所做的就是用urllib3
替换urllib.request
:
import urllib.request
import sys
urlToRead = 'http://www.google.com'
crawledWebLinks = {}
while urlToRead != ' ':
try:
urlToRead = input("Please enter the next URL to crawl: ")
if urlToRead == '':
print ("OK, exiting loop")
break
#if the user leaves the input blank it will break out of the loop
shortName = input("Please enter a short name for the URL " + urlToRead + ": ")
webFile = urllib.request.urlopen(urlToRead).read()
crawledWebLinks[shortName] = webFile
except:
print ("*************\nUnexpected Error*****", sys.exc_info()[0])
stopOrProceed = input("Hmm..stop or proceed? Enter 1 to stop, enter anything else to continue")
if stopOrProceed ==1 :
print ('OK...Stopping\n')
break
else:
print ("Cool! Let's continue\n")
continue
print (crawledWebLinks)
另一个注意事项,只是在except
块中打印出错误类型并不是很有用。我删除了并查看了实际的回溯后,我能够在30秒内调试您的代码。