Don't understand this "IndexError: list index out of range"

Time: 2019-06-11 21:59:09

Tags: python html web-scraping beautifulsoup

The code does not parse the table correctly, and I cannot work out exactly why it fails to find the table data. Can anyone help?

from bs4 import BeautifulSoup
import requests
import pandas as pd 

url = "https://webapps1.cityofchicago.org/activeecWeb/"
r = requests.get(url)
data = r.text
soup = BeautifulSoup(data, "html.parser")


table = soup.find_all('table')[1]
rows = table.find_all('tr')[1:]

data = {
    'LicenseType' : [],
    'CompanyName' : [],
    'Address' : [],
    'Phone' : [],
    'Expiration' : []
}

for row in rows:
    cols = row.find_all('td')
    data['LicenseType'].append( cols[0].get_text() )
    data['CompanyName'].append( cols[1].get_text() )
    data['Address'].append( cols[2].get_text() )  
    data['Phone'].append( cols[3].get_text() )
    data['Expiration'].append( cols[4].get_text() )

electricians = pd.DataFrame( data )
electricians.to_csv("ChicagoElectriciansData.csv")

1 answer:

Answer 0: (score: 0)

The error you are getting is caused by the last tr of the table, which evidently does not have all five td cells that the loop indexes. You can use a try/except clause to ignore the error. However, slicing off the last row as follows also solves the problem, which is what I have done here:

.find_all("tr")[1:-1]
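
As a rough illustration of why the last row matters, here is a minimal sketch that runs without network access. The SAMPLE_HTML string, the parse_rows helper, and the COLUMNS list are all hypothetical stand-ins, not the real page or the answerer's code, and the single-cell footer row is only an assumption about what the real table ends with. Instead of the [1:-1] slice or try/except mentioned above, it uses a third variant, a length check on each row, which skips any row that has too few cells wherever it appears.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page (an assumption, not the actual
# markup): the last row is a footer with a single cell, which is the kind
# of row that makes cols[1] raise IndexError.
SAMPLE_HTML = """
<table>
  <tr><th>LicenseType</th><th>CompanyName</th><th>Address</th><th>Phone</th><th>Expiration</th></tr>
  <tr><td>A</td><td>Acme Electric</td><td>1 Main St</td><td>555-0100</td><td>2020-01-01</td></tr>
  <tr><td>B</td><td>Volt Co</td><td>2 Oak Ave</td><td>555-0101</td><td>2020-06-30</td></tr>
  <tr><td colspan="5">Page 1 of 1</td></tr>
</table>
"""

COLUMNS = ["LicenseType", "CompanyName", "Address", "Phone", "Expiration"]


def parse_rows(html):
    """Collect cell text per column, skipping rows with too few <td> cells."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    data = {name: [] for name in COLUMNS}
    # [1:] drops the header row, as in the question's code.
    for row in table.find_all("tr")[1:]:
        cols = row.find_all("td")
        # Guard: a footer or spacer row has fewer cells than expected,
        # so indexing cols[0]..cols[4] on it would fail.
        if len(cols) < len(COLUMNS):
            continue
        for name, col in zip(COLUMNS, cols):
            data[name].append(col.get_text(strip=True))
    return data


data = parse_rows(SAMPLE_HTML)
print(data["CompanyName"])
```

The resulting dictionary can be passed to pd.DataFrame(data) exactly as in the question. The length check is a little more defensive than [1:-1], since it does not assume that the short row is always the last one.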