Merging multiple lists into one organized CSV with bs4

Asked: 2019-10-30 02:24:44

Tags: python html csv web-scraping beautifulsoup

I'm new to this and am treating it as a learning opportunity, but I've only gotten this far thanks to the community's help. I'm trying to scrape multiple sections of a page like this one:

https://m.the-numbers.com/movie/Black-Panther

specifically the summary, starring cast, and supporting cast sections.

I have successfully written one list to a CSV, but I can't seem to find a way to write multiple lists. I'm looking for a scalable solution where I can keep adding more lists to the export.

Things I have tried:

Putting them in separate lists, e.g. `details` and `actors`; using the same list with `details.extend`; and so on.

Expected result: a table like:

Headers:     title, amount, starName, starCharacter

with the data listed below them.

Error:     Exception has occurred: AttributeError: 'str' object has no attribute 'keys'
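For context on that traceback: `list.extend` iterates its argument, so extending a list with a dict adds only the dict's *keys* (plain strings), and a later `details[0].keys()` fails because `details[0]` is a string. A minimal repro (the sample values are illustrative):

```python
details = []
# extend() iterates the dict, so only its keys (strings) are added
details.extend({'title': 'Black Panther', 'amount': '$200,000,000'})
print(details)  # ['title', 'amount'] -- strings, so details[0].keys() raises AttributeError

# append() keeps the dict intact, which is what csv.DictWriter needs
rows = []
rows.append({'title': 'Black Panther', 'amount': '$200,000,000'})
print(list(rows[0].keys()))  # ['title', 'amount']
```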

```python
from bs4 import BeautifulSoup
import requests
import csv

# Making get request
r = requests.get('https://m.the-numbers.com/movie/Black-Panther')

# Creating BeautifulSoup object
soup = BeautifulSoup(r.text, 'lxml')

# Localizing table from the BS object
table_soup = soup.find('div', class_='row').find('div', class_='table-responsive').find('table', id='movie_finances')
website = 'https://m.the-numbers.com/'
details = []
# Iterating through the budget rows of the table (rows 2 and 3)

for tr in table_soup.find_all('tr')[2:4]:
    tds = tr.find_all('td')

    # Creating a dict for each row and appending it to the details list
    # (append, not extend: extend would iterate the dict and add only its keys)
    details.append({
        'title': tds[0].text.strip(),
        'amount': tds[1].text.strip(),
    })


cast_soup = soup.find('div', id='accordion').find('div', class_='cast_new').find('table', class_='table table-sm')
for tr in cast_soup.find_all('tr')[2:15]:
    tdc = tr.find_all('td')

    # Creating dict for each row and appending it to the details list
    details.append({
        'starName': tdc[0].text.strip(),
        'starCharacter': tdc[1].text.strip(),
    })

# Writing details list of dicts to file using csv.DictWriter
with open('moviesPage2018.csv', 'w', encoding='utf-8', newline='') as csv_file:
    writer = csv.DictWriter(csv_file, fieldnames=details[0].keys())
    writer.writeheader()
    writer.writerows(details)
```
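For the scalable multi-list export, one approach (a sketch only; `write_sections` and the sample rows are my own names and data, not taken from the page) is to keep each scraped section in its own list of dicts, then hand `csv.DictWriter` the union of keys across all sections and let `restval=''` leave missing columns blank:

```python
import csv

def write_sections(path, *sections):
    """Write several lists of dicts to one CSV.

    Fieldnames are the union of keys across all sections, in
    first-seen order; cells a row doesn't have are left blank.
    """
    fieldnames = []
    for section in sections:
        for row in section:
            for key in row:
                if key not in fieldnames:
                    fieldnames.append(key)
    with open(path, 'w', encoding='utf-8', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, restval='')
        writer.writeheader()
        for section in sections:
            writer.writerows(section)

# Illustrative sample data in place of the scraped lists
details = [{'title': 'Production Budget', 'amount': '$200,000,000'}]
actors = [{'starName': 'Chadwick Boseman', 'starCharacter': "T'Challa"}]
write_sections('moviesPage2018.csv', details, actors)
```

Adding another scraped section is then just one more list of dicts passed to `write_sections`, with its keys becoming new columns automatically.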


0 Answers:

No answers