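I have been building a small web scraper, and I think I am getting variable or function scope wrong. Whenever I try to pull some of the functionality out into a separate function, I get NameError: global name 'NAME' is not defined. I see that a lot of people have run into similar problems, but there seem to be many variations on the same error and I can't figure it out.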
import urllib2, sys, urlparse, httplib, imageInfo
from BeautifulSoup import BeautifulSoup
from collections import deque

global visited_pages
visited_pages = []
global visit_queue
visit_queue = deque([])
global motorcycle_pages
motorcycle_pages = []
global motorcycle_pics
motorcycle_pics = []
global count
count = 0

def scrapePages(url):
    #variables
    max_count = 20
    pic_num = 20
    #decide how long it should go on...
    global count
    if count >= max_count:
        return
    #this is all of the links that have been scraped
    the_links = []
    soup = soupify_url(url)
    #find all the links on the page
    for tag in soup.findAll('a'):
        the_links.append(tag.get('href'))
    visited_pages.append(url)
    count = count + 1
    print 'number of pages visited'
    print count
    links_to_visit = the_links
    # print 'links to visit'
    # print links_to_visit
    for link in links_to_visit:
        if link not in visited_pages:
            visit_queue.append(link)
    print 'visit queue'
    print visit_queue
    while visit_queue:
        link = visit_queue.pop()
        print link
        scrapePages(link)
    print '***done***'

the_url = 'http://www.reddit.com/r/motorcycles'
#call the function
scrapePages(the_url)

def soupify_url(url):
    try:
        html = urllib2.urlopen(url).read()
    except urllib2.URLError:
        return
    except ValueError:
        return
    except httplib.InvalidURL:
        return
    except httplib.BadStatusLine:
        return
    return BeautifulSoup.BeautifulSoup(html)
Here is my traceback:
Traceback (most recent call last):
  File "C:\Users\clifgray\Desktop\Mis Cosas\Programming\appengine\web_scraping\src\test.py", line 68, in <module>
    scrapePages(the_url)
  File "C:\Users\clifgray\Desktop\Mis Cosas\Programming\appengine\web_scraping\src\test.py", line 36, in scrapePages
    soup = soupify_url(url)
NameError: global name 'soupify_url' is not defined
Answer 0 (score: 5)
Move your main code:
the_url = 'http://www.reddit.com/r/motorcycles'
#call the function
scrapePages(the_url)
to after the point where you define soupify_url, i.e. the bottom of your file.
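Python reads the def scrapePages() definition and then immediately tries to call it. scrapePages() wants to call a function named soupify_url(), which has not been defined yet at that point in the file, so you get: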
NameError: global name 'soupify_url' is not defined
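The behaviour described above can be reproduced in a few lines. This is only a minimal sketch, written to run on Python 3 rather than the Python 2 of the question; scrape_pages and the stub soupify_url are hypothetical stand-ins that do no network access. It shows that the name is looked up when the call runs, not when the calling function is defined:

```python
def scrape_pages(url):
    # soupify_url is looked up only when this line executes,
    # not when scrape_pages itself is defined.
    return soupify_url(url)


# Calling too early: soupify_url does not exist yet, so this raises NameError.
try:
    scrape_pages('http://example.com')
    failed_early = False
except NameError:
    failed_early = True


def soupify_url(url):
    # Hypothetical stand-in for the real fetch-and-parse helper.
    return 'soup for ' + url


# Calling after the helper is defined works.
result = scrape_pages('http://example.com')
```

Note that scrape_pages itself never changes between the two calls; only the moment of the call relative to the def of soupify_url does.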
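Remember the rule: all functions must be defined before any code that does real work runs.
If you move the main code that calls scrapePages() to after the definition of soupify_url(), everything will be defined and in scope by the time the call happens, which should resolve your error.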
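One common way to follow this rule is to put the "real work" in a main() function that is called from the very bottom of the file. The sketch below is an assumption about how that layout could look, not the asker's actual scraper: fetch_links, FAKE_SITE, and main are hypothetical names, the links come from a hard-coded dictionary instead of the network, and the crawl is iterative rather than recursive to keep the example short.

```python
from collections import deque

# Hypothetical stand-in for the web: page -> links found on that page.
FAKE_SITE = {'a': ['b', 'c'], 'b': ['a', 'd']}


def scrape_pages(start_url, max_count=3):
    # Visit pages breadth-first until max_count pages have been seen.
    visited = []
    queue = deque([start_url])
    while queue and len(visited) < max_count:
        url = queue.popleft()
        if url in visited:
            continue
        visited.append(url)
        # fetch_links is defined further down; that is fine because
        # this line only runs once main() is called at the bottom.
        for link in fetch_links(url):
            if link not in visited:
                queue.append(link)
    return visited


def fetch_links(url):
    # Hypothetical helper: look the links up instead of downloading a page.
    return FAKE_SITE.get(url, [])


def main():
    return scrape_pages('a')


# The only code that does real work sits after every definition.
if __name__ == '__main__':
    print(main())
```

With this layout the order of the def statements no longer matters, because nothing is called until the whole file has been read.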