I have a script that contains these two functions:
import requests
from bs4 import BeautifulSoup

# Getting content of each page
def GetContent(url):
    response = requests.get(url)
    return response.content

# Extracting the sites
def CiteParser(content):
    soup = BeautifulSoup(content)
    print "---> site #: ",len(soup('cite'))
    result = []
    for cite in soup.find_all('cite'):
        result.append(cite.string.split('/')[0])
    return result
When I run the program, I get the following error:
result.append(cite.string.split('/')[0])
AttributeError: 'NoneType' object has no attribute 'split'
Sample output:
URL: <URL That I use to search 'can be google, bing, etc'>
---> site #: 10
site1.com
.
.
.
site10.com
URL: <URL That I use to search 'can be google, bing, etc'>
  File "python.py", line 49, in CiteParser
    result.append(cite.string.split('/')[0])
AttributeError: 'NoneType' object has no attribute 'split'
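Answer 0 (score: 8)
It can happen that a cite has no string inside it, in which case cite.string is None rather than a string. I would suggest checking that the string is not None before calling split on it: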
# Extracting the sites
def CiteParser(content):
    soup = BeautifulSoup(content)
    #print soup
    print "---> site #: ",len(soup('cite'))
    result = []
    for cite in soup.find_all('cite'):
        # Skip <cite> tags whose .string is None
        if cite.string is not None:
            result.append(cite.string.split('/')[0])
            print cite
    return result
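
One likely reason cite.string comes back as None: in BeautifulSoup, .string is only set when a tag has a single child, so a <cite> that contains nested markup (for example a <b> inside it) returns None. Below is a minimal sketch of that behaviour. The HTML is made up, and the names html, strings and domains are assumptions for illustration, not taken from the real search page. get_text() is one way to still get the text in that case:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second <cite> holds a nested <b> tag, so it has
# more than one child and its .string is None.
html = "<cite>site1.com/page</cite><cite>site2.com/<b>page</b></cite>"

# Name the parser explicitly; "html.parser" ships with Python.
soup = BeautifulSoup(html, "html.parser")
cites = soup.find_all("cite")

# .string is a string for the first cite and None for the second
strings = [cite.string for cite in cites]

# get_text() joins all nested text, so it returns a string in both cases
domains = [cite.get_text().split("/")[0] for cite in cites]
```

With get_text() no entries are skipped, whereas the "is not None" check silently drops them. Which one you want depends on whether the cites with nested markup matter to you.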
Answer 1 (score: 1)
for cite in soup.find_all('cite'):
    # Skip cites whose string is missing or empty
    if cite.string is None or len(cite.string) == 0:
        continue
    result.append(cite.string.split('/')[0])
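
If it helps, the same guard can be pulled into a small helper so it can be checked without hitting the network. This is only a sketch: first_segment and the sample list are hypothetical names and values standing in for what cite.string might return.

```python
def first_segment(text):
    """Return the part of text before the first '/', or None if text is None or empty."""
    if not text:
        return None
    return text.split("/")[0]


# Hypothetical values standing in for cite.string results
raw = ["site1.com/page", None, "", "site2.com"]

# Keep only the entries that produced a usable segment
domains = [seg for seg in map(first_segment, raw) if seg is not None]
```

Inside the loop this would be used as first_segment(cite.string), appending the result only when it is not None.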