This is the error:

    File "f**.py", line 34, in <module>
        url_type = url.split('-')[0][-2:] #

Here is the whole block:

    fit_urls = []
    for event_url in event_urls:
        print event_url
        try:
            sock = urllib.urlopen(event_url)
            event_html = sock.read()
            event_soup = BeautifulSoup(event_html)
            tds = event_soup.find_all('td')
            for td in tds:
                for link in td.find_all('a'):
                    url = link.get('href')
                    url_type = url.split('-')[0][-2:]  # last two characters before the first '-'
                    if url_type == 'ht':
                        #print url
                        fit_urls.append(url)
        except HTTPError:
            pass

Answer 0 (score: 0)
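This happens because at least one of your `link` tags has no `href` attribute. For such a tag, `link.get('href')` returns `None`, and `None` has no `split` method. You can verify this by adding `print link` just before the line `url = link.get('href')`.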
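
The failure can be reproduced without any scraping. The sketch below is only an illustration in Python 3 (the question's code is Python 2): it uses a plain dict as a stand-in for a tag's attributes, on the assumption that `link.get('href')` behaves like `dict.get` and returns `None` when the attribute is missing. The `attrs` dict and its contents are made up for the example.

```python
# Hypothetical stand-in for the attributes of an <a> tag that has no href.
attrs = {"name": "top"}

# Like dict.get, a lookup of a missing attribute yields None instead of raising.
url = attrs.get("href")

try:
    url.split("-")  # None has no split method
    error_name = None
except AttributeError as exc:
    error_name = type(exc).__name__

print(url, error_name)
```
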
To fix this, you can add an extra `if` check that filters out such links:

    for td in tds:
        for link in td.find_all('a'):
            url = link.get('href')
            if url:  # extra check: the body is skipped when `url` is None
                url_type = url.split('-')[0][-2:]  # last two characters before the first '-'
                # the rest of your code

Answer 1 (score: 0)
It looks like `url = link.get('href')` is returning `None`. You can check for `None` inside the loop:

    for td in tds:
        for link in td.find_all('a'):
            url = link.get('href')
            if not url:
                continue
            url_type = url.split('-')[0][-2:]  # last two characters before the first '-'
            if url_type == 'ht':
                #print url
                fit_urls.append(url)
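
To try the guard without fetching any pages, the filtering step can be pulled out into a small function. This is a minimal Python 3 sketch, not part of the original code: the function name `select_fit_urls`, its `wanted_type` parameter, and the sample hrefs are all hypothetical, and it assumes the hrefs were already collected with `link.get('href')`.

```python
def select_fit_urls(hrefs, wanted_type="ht"):
    """Keep hrefs whose part before the first '-' ends with wanted_type."""
    fit_urls = []
    for url in hrefs:
        if not url:  # skips None (tag without href) and empty strings
            continue
        url_type = url.split("-")[0][-2:]
        if url_type == wanted_type:
            fit_urls.append(url)
    return fit_urls


# Made-up sample values, including a missing href (None) and an empty one.
sample_hrefs = [None, "/events/ht-123", "", "/events/xx-456"]
result = select_fit_urls(sample_hrefs)
print(result)
```

Using `if not url` also skips empty `href=""` values; if those should be treated differently, `if url is None` is the narrower check.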