Why does Python think my variable is empty?

Date: 2015-10-14 04:17:15

Tags: python concatenation string-concatenation

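Before I added the two if statements to the third block of code, I got almost the same error: it cannot concatenate str and NoneType.

However, when I uncomment the print statement in the third if statement, it prints a list of URLs with their paths.

I have also tried this on other websites; this is not the only one where it fails.

Here is my traceback: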

Traceback (most recent call last):
  File "linkcrawler.py", line 24, in <module>
    newurl = "http://" + b1 + b2
TypeError: cannot concatenate 'str' and 'NoneType' objects
Traceback (most recent call last):
  File "linkcrawler.py", line 24, in <module>
     newurl = "http://" + b1 + b2
TypeError: cannot concatenate 'str' and 'NoneType' objects

I get two of these every time I run it.

import urllib
from bs4 import BeautifulSoup
import traceback
import urlparse
import mechanize

url = "http://www.dailymail.co.uk/home/index.html"
br = mechanize.Browser()
urls = [url]
visited = [url]

while len(urls)>0:
    try:
        br.open(urls[0])
        urls.pop(0)
        for link in br.links():
            newurl = urlparse.urljoin(link.base_url,link.url)
            b1 = urlparse.urlparse(newurl).hostname
            b2 = urlparse.urlparse(newurl).path

            newurl = "http://"+b1+b2

            if newurl not in visited and urlparse.urlparse(url).hostname in newurl:
                urls.append(newurl)
                visited.append(newurl)
                #print newurl
    except:
        traceback.print_exc()
        urls.pop(0)
print visited

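1 Answer:

Answer 0 (score: 0)

Either b1 or b2 is None. In practice it is b1: urlparse.urlparse(newurl).hostname is None for links with no network location (such as mailto: or javascript: links), whereas .path is always a string, possibly an empty one. To fix this, check that b1 and b2 are neither empty nor None before concatenating, and restructure the code: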

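b1 = urlparse.urlparse(newurl).hostname
b2 = urlparse.urlparse(newurl).path

# Only rebuild the URL when both parts are present; links without a
# hostname (e.g. mailto: or javascript: links) are simply skipped.
if b1 and b2:
    newurl = "http://"+b1+b2
    if newurl not in visited and urlparse.urlparse(url).hostname in newurl:
        urls.append(newurl)
        visited.append(newurl)
        #print newurl
# No else branch: calling urls.pop(0) here would discard a queued page
# for every skipped link.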
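
To see where the None comes from, here is a minimal sketch that does not need mechanize or a network connection. The helper name normalize_link and the sample href values are assumptions made up for illustration; only the base URL comes from the question. The import is written to work on Python 3 (urllib.parse) as well as on Python 2 (urlparse), which the question uses.

```python
try:
    from urllib.parse import urljoin, urlparse  # Python 3
except ImportError:
    from urlparse import urljoin, urlparse  # Python 2, as in the question


def normalize_link(base_url, href):
    """Return 'http://host/path' for href, or None if it has no hostname.

    normalize_link is a hypothetical helper name, not part of any library.
    """
    parsed = urlparse(urljoin(base_url, href))
    # hostname is None for links without a network location,
    # e.g. mailto: or javascript: links; path is always a string.
    if parsed.hostname is None:
        return None
    return "http://" + parsed.hostname + parsed.path


base = "http://www.dailymail.co.uk/home/index.html"
# Sample hrefs are illustrative assumptions, not taken from the real page.
for href in ["/news/index.html", "mailto:someone@example.com", "javascript:void(0)"]:
    print(repr(normalize_link(base, href)))
```

The first href yields a rebuilt URL, while the other two yield None instead of raising TypeError. Like the code in the question, this sketch rebuilds the URL from the hostname and path only, so any query string, port, or https scheme is dropped.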