I'm not entirely new to this; I treat it as a learning opportunity, but I've only gotten this far thanks to help from the community. However, I'm trying to scrape multiple sections of a page like this:
https://m.the-numbers.com/movie/Black-Panther
specifically the summary, the starring cast, and the supporting cast.
I've successfully written one list to CSV, but I can't seem to find a way to write multiple lists. I'm looking for a scalable solution where I can keep adding more lists to the export.
Things I've tried:
putting them in separate lists, e.g. `details` and `actors`;
using the same list with `details.extend`;
and so on.
The expected result is a table with the header
title, amount, starName, starCharacter
and the data listed below it.
The error:
Exception has occurred: AttributeError: 'str' object has no attribute 'keys'
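For context, that traceback is consistent with passing a dict to `list.extend`: `extend` iterates its argument, and iterating a dict yields only its keys, so the list ends up holding plain strings, and calling `.keys()` on one of them fails. A minimal sketch (the sample values are illustrative):

```python
row = {'title': 'Production Budget', 'amount': '$200,000,000'}

# extend() iterates its argument; iterating a dict yields its keys,
# so the list ends up holding strings, not dicts.
details = []
details.extend(row)
print(details)           # ['title', 'amount']
print(type(details[0]))  # <class 'str'>, so details[0].keys() raises AttributeError

# append() adds the dict itself as a single element.
details = []
details.append(row)
print(list(details[0].keys()))  # ['title', 'amount']
```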
```python
import requests
from bs4 import BeautifulSoup
import csv

# Making get request
r = requests.get('https://m.the-numbers.com/movie/Black-Panther')

# Creating BeautifulSoup object
soup = BeautifulSoup(r.text, 'lxml')

# Localizing table from the BS object
table_soup = soup.find('div', class_='row').find('div', class_='table-responsive').find('table', id='movie_finances')

website = 'https://m.the-numbers.com/'
details = []

# Iterating through all trs in the table except the first (header) and the last two (summary) rows
for tr in table_soup.find_all('tr')[2:4]:
    tds = tr.find_all('td')
    # Creating dict for each row and appending it to the details list
    details.extend({
        'title': tds[0].text.strip(),
        'amount': tds[1].text.strip(),
    })

cast_soup = soup.find('div', id='accordion').find('div', class_='cast_new').find('table', class_='table table-sm')
for tr in cast_soup.find_all('tr')[2:15]:
    tdc = tr.find_all('td')
    # Creating dict for each row and appending it to the details list
    details.append({
        'starName': tdc[0].text.strip(),
        'starCharacter': tdc[1].text.strip(),
    })

# Writing details list of dicts to file using csv.DictWriter
with open('moviesPage2018.csv', 'w', encoding='utf-8', newline='\n') as csv_file:
    writer = csv.DictWriter(csv_file, fieldnames=details[0].keys())
    writer.writeheader()
    writer.writerows(details)
```
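One way the "keep adding more lists" goal could be sketched (the data values and the `sections` name are illustrative, not from the original): keep each scraped section as its own list of dicts, build the `fieldnames` list as the union of every key seen, and let `csv.DictWriter`'s `restval` fill in the columns a given row doesn't have.

```python
import csv

# Illustrative stand-ins for the scraped sections.
details = [{'title': 'Production Budget', 'amount': '$200,000,000'}]
actors = [{'starName': 'Chadwick Boseman', 'starCharacter': "T'Challa"}]

sections = [details, actors]  # keep appending more lists here

# Union of all keys across every section, preserving first-seen order.
fieldnames = []
for section in sections:
    for row in section:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)

with open('moviesPage2018.csv', 'w', encoding='utf-8', newline='') as csv_file:
    # restval='' fills columns a row does not define.
    writer = csv.DictWriter(csv_file, fieldnames=fieldnames, restval='')
    writer.writeheader()
    for section in sections:
        writer.writerows(section)
```

With this shape, adding another section (say, supporting cast) is just another list of dicts appended to `sections`; the header and the blank-filling adjust automatically.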