Question

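I wrote a script to scrape some data from a website, but it only runs for a few pages and then stops with "'NoneType' object has no attribute 'a'". Another error that sometimes appears is this: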

File "scrappy3.py", line 31, in <module>
f.writerow(doc_details)
File "C:\python\lib\encodings\cp1252.py", line 19, in encode
return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u015f' in position 251: character maps to <undefined>

Could you give me some advice on how to fix these errors? Here is my script:

import requests
import csv
from bs4 import BeautifulSoup
import re
import time

start_time = time.time()
page = 1
f = csv.writer(open("./doctors.csv", "w", newline=''))
while page <= 5153:
    url = "http://www.sfatulmedicului.ro/medici/n_s0_c0_h_s0_e0_h0_pagina" + str(page)
    data = requests.get(url)
    print ('scraping page ' + str(page))
    soup = BeautifulSoup(data.text,"html.parser")
    for liste in soup.find_all('li',{'class':'clearfix'}):
        doc_details = []
        url_doc = liste.find('a').get('href')
        for a in liste.find_all('a'):
            if a.has_attr('name'):
                doc_details.append(a['name'])   
        data2 = requests.get(url_doc)       
        soup = BeautifulSoup(data2.text,"html.parser")
        a_tel = soup.find('div',{'class':'contact_doc add_comment'}).a              
        tel_tag=a_tel['onclick']
        tel = tel_tag[tel_tag.find("$(this).html("):tel_tag.find(");")].lstrip("$(this).html(") 
        doc_details.append(tel)         
    f.writerow(doc_details)

    page += 1
print("--- %s seconds ---" % (time.time() - start_time))

Answer 1

Your error is here:

   a_tel = soup.find('div',{'class':'contact_doc add_comment'}).a

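soup.find evidently did not find a div with the required class. In that case it returns None, and None has no attribute a, which is exactly what the error message says.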

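You should check for this and decide whether to continue with the next iteration of the loop or bail out. For example:

   div_contact = soup.find('div',{'class':'contact_doc add_comment'})
   if div_contact is None:
       continue
   a_tel = div_contact.a

You could also use a try .. except block to cover more cases (for example, when the div does not actually contain what you expect).

In theory that is more Pythonic. Either way, the choice is yours.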
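A minimal sketch of the try .. except variant, for comparison. The helper name extract_onclick is hypothetical and not part of the original script; it assumes soup is the BeautifulSoup object built from a doctor's detail page, and it treats a missing div, a missing link and a missing onclick attribute the same way.

```python
def extract_onclick(soup):
    """Return the onclick attribute of the contact link, or None if it is missing.

    `soup` is assumed to be the BeautifulSoup object for a detail page
    (hypothetical helper, not part of the original script).
    """
    try:
        # soup.find(...) returns None when no matching div exists, so `.a`
        # raises AttributeError. If the div has no <a>, `.a` is None and the
        # subscript raises TypeError. A link without onclick raises KeyError.
        return soup.find('div', {'class': 'contact_doc add_comment'}).a['onclick']
    except (AttributeError, TypeError, KeyError):
        return None
```

Inside the loop this would be used as tel_tag = extract_onclick(soup), followed by if tel_tag is None: continue. Catching only these three exception types keeps unrelated bugs visible instead of silently swallowing them.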

Continuous and consistent error checking is part of programming.

Answer 2

resp_find = soup.find('div',{'class':'contact_doc add_comment'})
if resp_find is not None:
    a_tel = resp_find.a

You can check whether the result of soup.find() is None, and only if it is not, apply .a
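The check on its own leaves a_tel undefined when nothing is found, so the code after it still needs to know that the page should be skipped. One possible way to handle both branches is sketched below. The function name contact_link_or_report and the missing_urls list are hypothetical; the idea is simply to remember which pages lacked the contact block so they can be inspected afterwards.

```python
def contact_link_or_report(soup, url, missing_urls):
    """Return the <a> inside the contact div, or None after recording the URL.

    `soup` is assumed to be a BeautifulSoup object for the page at `url`;
    `missing_urls` is a plain list supplied by the caller (both hypothetical).
    """
    resp_find = soup.find('div', {'class': 'contact_doc add_comment'})
    if resp_find is None or resp_find.a is None:
        # Record the page so it can be opened in a browser later to see
        # what it actually contains (a different layout, an error page, ...).
        missing_urls.append(url)
        return None
    return resp_find.a
```

In the scraping loop the caller would skip the row when the function returns None, and print missing_urls at the end of the run.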

Alternatively, make sure that soup.find() never returns None, which means investigating why it returns None for these pages in the first place.
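As for the second error in the question, the traceback shows the cp1252 codec failing inside f.writerow, which suggests the CSV file was opened with the Windows default encoding and that encoding cannot represent '\u015f' (ş). A minimal sketch, assuming the file should be written as UTF-8; the helper name write_rows_utf8 is hypothetical:

```python
import csv


def write_rows_utf8(path, rows):
    """Write rows to a CSV file encoded as UTF-8 (hypothetical helper)."""
    # Without encoding=..., open() falls back to the platform's preferred
    # encoding, which is cp1252 in the traceback from the question.
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerows(rows)
```

In the original script the equivalent change would be to pass encoding="utf-8" to the open() call that creates doctors.csv. If the file is meant to be opened in Excel, encoding="utf-8-sig" may be the better choice, since the byte order mark helps Excel detect the encoding.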

Why do I get the "'NoneType' object has no attribute" error
