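I know this should be simple, which is exactly why it is frustrating. I have searched for similar questions, and I do understand what "AttributeError: 'NoneType' object has no attribute 'find'" means.
I am writing a web scraping script that pulls company titles, names, and so on from this website. What confuses me is that two of the searches, title and primary contact, work perfectly, while the searches for company name, email, and phone number do not, even though all of those fields exist on the page and are structured the same way as the ones that work. Can anyone point out what is different about them?
My code is below. The first two searches work, but the third returns None. I am new to Python, and any help is appreciated.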
# import libraries
import urllib2
from bs4 import BeautifulSoup
import csv
from datetime import datetime
data = []
for i in range(0, 4):
    quote_page = 'http://www.homeopathycenter.org/professional-and-organizational-directory' + '?field_geofield_distance%5Bdistance%5D=50&field_geofield_distance%5Bunit%5D=3959&field_geofield_distance%5Borigin%5D=&field_professional_category_tid=All&combine=&field_address_locality=&field_address_administrative_area=&field_address_country=All&field_consultations_by_phone_value=All&field_consultations_online_value=All&field_animal_consultations_value=All&page={}'.format(i)
    page = urllib2.urlopen(quote_page)
    # parse the html using beautiful soup and store in variable `soup`
    soup = BeautifulSoup(page, 'html.parser')
    # identify full contact box
    box_array = ['views-row views-row-1 views-row-odd views-row-first',
                 'views-row views-row-2 views-row-even',
                 'views-row views-row-3 views-row-odd',
                 'views-row views-row-4 views-row-even',
                 'views-row views-row-5 views-row-odd',
                 'views-row views-row-6 views-row-even',
                 'views-row views-row-7 views-row-odd',
                 'views-row views-row-8 views-row-even',
                 'views-row views-row-9 views-row-odd',
                 'views-row views-row-10 views-row-even views-row-last']
    # loop the boxes
    for box in box_array:
        full_box = soup.find('div', {'class': box})
        # get title from box, this works
        title_box = full_box.find('div', {'class': 'views-field views-field-title'})
        title_content = title_box.find('span', {'class': 'field-content'})
        title = title_content
        title = title.text.strip()  # strip() is used to remove starting and trailing whitespace
        # get contact name from box, this works
        contact_box = full_box.find('div', {'class': 'views-field views-field-field-primary-contact'})
        contact_content = contact_box.find('div', {'class': 'field-content'})
        name = contact_content
        name = name.text.strip()
        # get company name from box, this returns None
        company_box = full_box.find('div', {'class': 'views-field views-field-field-company-name'})
        company_content = company_box.find('div', {'class': 'field-content'})
        company = company_content.text.strip()
Edit: the error is as follows:

Traceback (most recent call last):
  File "F:\Program Files (x86)\Python\sc3.py", line 46, in <module>
    company_content = company_box.find('div', {'class': 'field-content'})
AttributeError: 'NoneType' object has no attribute 'find'
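
For reference, here is a minimal sketch of how each lookup could be guarded so that a missing element yields None instead of raising. It only illustrates the failure mode in the traceback (`find` returning None and the next `.find` being called on it); it does not explain why the company name div is not found. The helper name `field_text` and its parameters are hypothetical, and it assumes `box` behaves like the BeautifulSoup tag held in `full_box` above.

```python
def field_text(box, wrapper_class, content_tag='div'):
    """Return the stripped text of one field inside a contact box.

    Returns None when the box, the field wrapper, or the inner
    'field-content' element is missing, instead of raising AttributeError.
    `box` is assumed to expose .find(name, attrs) like a BeautifulSoup tag.
    """
    if box is None:
        return None
    # Outer div for the field, e.g. 'views-field views-field-field-company-name'
    wrapper = box.find('div', {'class': wrapper_class})
    if wrapper is None:
        return None
    # Inner element that holds the text ('span' for the title, 'div' otherwise)
    content = wrapper.find(content_tag, {'class': 'field-content'})
    if content is None:
        return None
    return content.text.strip()
```

Inside the loop this would be called as `company = field_text(full_box, 'views-field views-field-field-company-name')`, and printing `box` whenever the result is None would show which entries on which page are the ones failing.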