AttributeError while web scraping

Time: 2019-08-25 01:06:56

Tags: python web-scraping beautifulsoup attributeerror

I get an AttributeError while web scraping, but I'm not sure what I'm doing wrong. What does AttributeError mean?

    response_obj = requests.get('https://en.wikipedia.org/wiki/Demographics_of_New_York_City').text
    soup = BeautifulSoup(response_obj,'lxml')
    Population_Census_Table = soup.find('table', {'class':'wikitable sortable'})

Preparing the table

    rows = Population_Census_Table.select("tbody > tr")[3:8]

    jurisdiction = []

    for row in rows:
        jurisdiction = {}
        tds = row.select('td')
        jurisdiction["jurisdiction"] = tds[0].text.strip()
        jurisdiction["population_census"] = tds[1].text.strip()
        jurisdiction["%_white"] = float(tds[2].text.strip().replace(",",""))
        jurisdiction["%_black_or_african_amercian"] = float(tds[3].text.strip().replace(",",""))
        jurisdiction["%_Asian"] = float(tds[4].text.strip().replace(",",""))
        jurisdiction["%_other"] = float(tds[5].text.strip().replace(",",""))
        jurisdiction["%_mixed_race"] = float(tds[6].text.strip().replace(",",""))
        jurisdiction["%_hispanic_latino_of_other_race"] = float(tds[7].text.strip().replace(",",""))
        jurisdiction["%_catholic"] = float(tds[7].text.strip().replace(",",""))
        jurisdiction["%_jewish"] = float(tds[8].text.strip().replace(",",""))

        jurisdiction.append(jurisdiction)

    print(jurisdiction)
  

AttributeError

   ---> 18     jurisdiction.append(jurisdiction)
   AttributeError: 'dict' object has no attribute 'append'

1 Answer:

Answer 0 (score: 0)

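You start with `jurisdiction` as a list and immediately rebind it to a dict. You then treat it as a dict until the error line, where you try to treat it as a list again. I think you need a different name for the list at the start; you probably meant `jurisdictions` (plural) for the list. However, IMO there are two other areas that also need fixing:

  1. `find` returns a single table, the first match. The labels/keys in your dict indicate you want a later table, not the first match.

  2. Your indexing into the target table is incorrect.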
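The name clash can be reproduced without any scraping. Below is a minimal sketch with made-up rows; `raw_rows`, `collect_rows` and `reproduce_error` are illustrative names, not part of the question's code. It shows that rebinding one name from a list to a dict is what triggers the AttributeError, and that a separately named list avoids it.

```python
# Hypothetical placeholder data standing in for the scraped table rows.
raw_rows = [("Place A", "1,000"), ("Place B", "2,000")]


def collect_rows(rows):
    """Build one dict per row and collect them in a separately named list."""
    jurisdictions = []  # the list keeps its own (plural) name
    for name, population in rows:
        jurisdiction = {}  # a fresh dict for each row
        jurisdiction["jurisdiction"] = name
        jurisdiction["population_census"] = population
        jurisdictions.append(jurisdiction)  # list.append exists
    return jurisdictions


def reproduce_error(rows):
    """Reuse one name for both the list and the dict, as in the question."""
    jurisdiction = []
    for name, population in rows:
        jurisdiction = {}  # rebinding: the name now refers to a dict
        jurisdiction["jurisdiction"] = name
        try:
            jurisdiction.append(jurisdiction)  # dict has no append method
        except AttributeError as exc:
            return str(exc)
    return None


print(collect_rows(raw_rows))
print(reproduce_error(raw_rows))
```

In general, an AttributeError means the object a name currently refers to does not have the attribute or method being accessed; here the object is a dict, and dicts have no `append`.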

You want something like:

    import requests
    from bs4 import BeautifulSoup

    response_obj = requests.get('https://en.wikipedia.org/wiki/Demographics_of_New_York_City').text
    soup = BeautifulSoup(response_obj, 'lxml')
    # Use a CSS selector to target the correct table rather than the first match.
    Population_Census_Table = soup.select_one('.wikitable:nth-of-type(5)')
    jurisdictions = []  # the list of results gets its own (plural) name
    rows = Population_Census_Table.select("tbody > tr")[3:8]
    for row in rows:
        jurisdiction = {}  # a fresh dict for each row
        tds = row.select('td')
        jurisdiction["jurisdiction"] = tds[0].text.strip()
        jurisdiction["population_census"] = tds[1].text.strip()
        jurisdiction["%_white"] = float(tds[2].text.strip().replace(",",""))
        jurisdiction["%_black_or_african_amercian"] = float(tds[3].text.strip().replace(",",""))
        jurisdiction["%_Asian"] = float(tds[4].text.strip().replace(",",""))
        jurisdiction["%_other"] = float(tds[5].text.strip().replace(",",""))
        jurisdiction["%_mixed_race"] = float(tds[6].text.strip().replace(",",""))
        jurisdiction["%_hispanic_latino_of_other_race"] = float(tds[7].text.strip().replace(",",""))
        jurisdiction["%_catholic"] = float(tds[10].text.strip().replace(",",""))
        jurisdiction["%_jewish"] = float(tds[12].text.strip().replace(",",""))
        jurisdictions.append(jurisdiction)
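As an optional tidy-up, the repeated `float(... .strip().replace(",",""))` conversions could be factored into one helper. This is only a sketch: `to_float` is a hypothetical name, and stripping a trailing `%` is an assumption about what the cells might contain, not something the page is known to require.

```python
def to_float(cell_text):
    """Convert table-cell text such as ' 1,234.5 ' or '44.0%' to a float."""
    # Trim whitespace, drop thousands separators, then drop a trailing percent sign.
    cleaned = cell_text.strip().replace(",", "").rstrip("%")
    return float(cleaned)


print(to_float(" 1,234.5 "))
print(to_float("44.0%"))
```

With such a helper, each assignment in the loop would shorten to something like `jurisdiction["%_white"] = to_float(tds[2].text)`. Note that `float` still raises ValueError on empty or non-numeric cells, so a wrong column index tends to fail loudly rather than silently.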