This code does not parse the table correctly, and I can't work out exactly why the table data is not being found. Can anyone help?
from bs4 import BeautifulSoup
import requests
import pandas as pd
url = "https://webapps1.cityofchicago.org/activeecWeb/"
r = requests.get(url)
data = r.text
soup = BeautifulSoup(data, "html.parser")
table = soup.find_all('table')[1]
rows = table.find_all('tr')[1:]
data = {
    'LicenseType' : [],
    'CompanyName' : [],
    'Address' : [],
    'Phone' : [],
    'Expiration' : []
}
for row in rows:
    cols = row.find_all('td')
    data['LicenseType'].append( cols[0].get_text() )
    data['CompanyName'].append( cols[1].get_text() )
    data['Address'].append( cols[2].get_text() )
    data['Phone'].append( cols[3].get_text() )
    data['Expiration'].append( cols[4].get_text() )
electricians = pd.DataFrame( data )
electricians.to_csv("ChicagoElectriciansData.csv")
Answer 0 (score: 0)
The error you are getting is caused by the last tr of the table. You can use a try/except clause to ignore the error. However, slicing off that last row when selecting the rows also solves the problem, which is what I did here:
.find_all("tr")[1:-1]
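For comparison, here is a minimal sketch of the try/except alternative. It assumes the trailing row simply has fewer than five td cells, which has not been checked against the live page. FIELDS, cell_text and collect_columns are hypothetical helper names, not part of the original script.

```python
# Column names taken from the question's dict, in table order.
FIELDS = ["LicenseType", "CompanyName", "Address", "Phone", "Expiration"]


def cell_text(rows):
    """Turn BeautifulSoup <tr> tags into lists of stripped <td> text."""
    return [[td.get_text(strip=True) for td in row.find_all("td")]
            for row in rows]


def collect_columns(cell_rows):
    """Build a dict of columns, skipping rows with too few cells."""
    data = {name: [] for name in FIELDS}
    for cells in cell_rows:
        try:
            # Read all five values before appending anything, so a
            # short row cannot leave the columns with unequal lengths.
            values = [cells[i] for i in range(len(FIELDS))]
        except IndexError:
            continue  # assumed: a footer/spacer row without data cells
        for name, value in zip(FIELDS, values):
            data[name].append(value)
    return data
```

With the question's variables this would be called as collect_columns(cell_text(table.find_all('tr')[1:])), and the result can be passed to pd.DataFrame as before. Unlike the [1:-1] slice, this skips a short row wherever it appears, at the cost of silently dropping rows you might have wanted to inspect.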