Here is my code:
from bs4 import BeautifulSoup
import requests
for count in range(1,2):
    r = requests.get('http://manufacturer.indiatradepage.com/all/a_a_enterprises/' + str(count) + '/',headers={'User-Agent': 'Googlebot'})
    soup = BeautifulSoup(r.text,'lxml')
    data = soup.find('div',class_='container_main')
    for links in data.find_all('div',class_='com_countainer'):
        for link in links.find_all('a')[0:1]:
            l = link['href']
            r = requests.get(l)
            soup = BeautifulSoup(r.text,'lxml')
            data = soup.find('td',{"id":"intro_txt"})
            table1 = data.find('table',{"style":"max-height: 400px !important"})
            body1 = table1.find('div',class_='f_body')
            table2 = data.find('div',{"id":"f3_1"})
            div = table2.find('div',class_='f_body')
            body2 = div.find('div',{"style":"text-transform:capitalize; "})
            print body2.text + body1.text
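I get the following error message:
Traceback (most recent call last):
  File "C:/Python27/indiatradepage_try.py", line 19
    body1 = table1.find('div',class_='f_body')
AttributeError: 'NoneType' object has no attribute 'find'
My code breaks every time because of this error.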
Answer 0 (score: 0)
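You can fix this by not calling .find on a NoneType object. That is what happens at body1 = table1.find('div',class_='f_body') when table1 is None, and it can happen in the same way with the result of table2 = data.find('div',{"id":"f3_1"}). BeautifulSoup's find returns None when nothing matches, so the result has to be checked before it is used.
You can do something like the following: it checks whether each table is None and, if it is, stores a message saying the table is not present instead of calling .find on it, and then carries on with the loop.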
from bs4 import BeautifulSoup
import requests
for count in range(1,2):
    r = requests.get('http://manufacturer.indiatradepage.com/all/a_a_enterprises/' + str(count) + '/',headers={'User-Agent': 'Googlebot'})
    soup = BeautifulSoup(r.text,'lxml')
    data = soup.find('div',class_='container_main')
    for links in data.find_all('div',class_='com_countainer'):
        # Only follow the first link in each company block
        for link in links.find_all('a')[0:1]:
            l = link['href']
            r = requests.get(l)
            soup = BeautifulSoup(r.text,'lxml')
            data = soup.find('td',{"id":"intro_txt"})
            table1 = data.find('table',{"style":"max-height: 400px !important"})
            # find() returns None when nothing matches, so check before using it
            if table1 is not None:
                body1 = table1.find('div',class_='f_body').text
            else:
                body1 = ' table1 not present '
            table2 = data.find('div',{"id":"f3_1"})
            if table2 is not None:
                div = table2.find('div',class_='f_body')
                body2 = div.find('div',{"style":"text-transform:capitalize; "}).text
            else:
                body2 = ' table2 not present '
            print (body2 + body1)
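The code above only guards table1 and table2, so any other lookup in the chain (data, div, or the inner f_body div) can still come back as None and raise the same AttributeError. One way to cover every step is to fold the check into a small helper. The following is only a sketch: find_text is a hypothetical name, not part of BeautifulSoup, and the HTML string is made-up markup standing in for one company page so the example runs without a network request. It uses the built-in 'html.parser' rather than 'lxml' purely to avoid the extra dependency.

```python
from bs4 import BeautifulSoup


def find_text(root, steps, default):
    """Follow a chain of find() lookups (hypothetical helper).

    `steps` is a list of (tag_name, attrs) pairs. If any lookup in the
    chain finds nothing, `default` is returned instead of raising.
    """
    node = root
    for name, attrs in steps:
        if node is None:
            return default
        node = node.find(name, attrs)
    if node is None:
        return default
    return node.get_text(strip=True)


# Made-up markup: the f3_1 block is present, the styled table is missing.
html = (
    '<table><tr><td id="intro_txt">'
    '<div id="f3_1"><div class="f_body">'
    '<div style="text-transform:capitalize; ">A A Enterprises</div>'
    '</div></div>'
    '</td></tr></table>'
)
soup = BeautifulSoup(html, 'html.parser')

body1 = find_text(soup, [
    ('td', {'id': 'intro_txt'}),
    ('table', {'style': 'max-height: 400px !important'}),
    ('div', {'class': 'f_body'}),
], ' table1 not present ')

body2 = find_text(soup, [
    ('td', {'id': 'intro_txt'}),
    ('div', {'id': 'f3_1'}),
    ('div', {'class': 'f_body'}),
    ('div', {'style': 'text-transform:capitalize; '}),
], ' table2 not present ')

print(body2 + body1)
```

In the scraping loop, the same two calls could replace the if/else blocks, with soup being the parsed company page. The trade-off is that the fallback message no longer tells you which step of the chain was missing.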