How do I arrange TR and TD elements into a structured format?

Asked: 2018-12-20 18:48:40

Tags: python python-3.x

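I put together some code that imports data from a CSV file into a data frame, loops over one column of that data frame, and scrapes the tables found at a series of URLs built from that column. Here is my script.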

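from bs4 import BeautifulSoup
import requests
import pandas as pd


# Load the list of exchanges; the CSV is expected to have a "Symbol" column.
df = pd.read_csv('C:\\Users\\ryans\\OneDrive\\Desktop\\Briefcase\\NY Times Dates\\exchanges.csv')
print(df)

for index, row in df.iterrows():
    # Build the holiday-page URL for this exchange.
    passin = 'https://markets.on.nytimes.com/research/markets/holidays/holidays.asp?display=market&exchange=' + row["Symbol"]
    r = requests.get(passin)
    #print(r)
    data = r.text
    #print(data)
    soup = BeautifulSoup(data, "html.parser")

    table = soup.find("table", {"id": "holidayTable"})
    if table is None:
        # No holiday table on this page; move on to the next exchange.
        continue

    # Use a separate name (tr) so the outer loop's `row` is not overwritten.
    for tr in table.find_all("tr"):
        for cell in tr.find_all("td"):
            print(cell.get_text().strip())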

The output looks like this:

01/01/2018
Monday
New Year's Day
Santiago Stock ExchangeChile
01/16/2018
Tuesday
Public Holiday 2
Santiago Stock ExchangeChile
03/30/2018
Friday
Good Friday
etc., etc., etc.

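How can I put this into a data frame that matches the table on the web page, with one row per holiday instead of one printed line per cell? I'd like it to look basically like this:

[image: desired output]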
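
For reference, one possible approach is to collect the cell texts of each `tr` into a list and build the data frame from those lists, rather than printing every cell on its own line. The following is only a minimal sketch under stated assumptions: `SAMPLE_HTML` is a hypothetical stand-in for a downloaded page (the real markup may differ), the column names in `COLUMNS` are guesses based on the printed output above, and `table_to_records` is an illustrative helper, not part of any library.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Hypothetical stand-in for one downloaded page; the real markup may differ.
SAMPLE_HTML = """
<table id="holidayTable">
  <tr><th>Date</th><th>Day</th><th>Holiday</th><th>Exchange</th></tr>
  <tr><td>01/01/2018</td><td>Monday</td><td>New Year's Day</td><td>Santiago Stock Exchange</td></tr>
  <tr><td>03/30/2018</td><td>Friday</td><td>Good Friday</td><td>Santiago Stock Exchange</td></tr>
</table>
"""

# Assumed column names, inferred from the four values printed per holiday.
COLUMNS = ["Date", "Day", "Holiday", "Exchange"]


def table_to_records(html, table_id="holidayTable"):
    """Return one list of cell texts per <tr> that has a full set of <td> cells."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", {"id": table_id})
    if table is None:
        return []  # page has no such table
    records = []
    for tr in table.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        # Skip header rows (<th> only) and rows with an unexpected cell count.
        if len(cells) == len(COLUMNS):
            records.append(cells)
    return records


holidays = pd.DataFrame(table_to_records(SAMPLE_HTML), columns=COLUMNS)
print(holidays)
```

In the original loop, the same helper could be called on `r.text` for each exchange, with the resulting records appended to a single list before the data frame is built once at the end.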

0 answers:

No answers yet.