Save each column of a DataFrame to a separate CSV file

Time: 2019-11-16 18:08:42

Tags: python dataframe

I have the following code, which downloads Yahoo Finance data into a DataFrame:

import bs4 as bs
import requests
import yfinance as yf
import datetime
import pandas

resp = requests.get('http://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
soup = bs.BeautifulSoup(resp.text, "html.parser") 
table = soup.find('table', {'class': 'wikitable sortable'})
tickers = []
for row in table.findAll('tr')[1:]:
    ticker = row.findAll('td')[0].text
    tickers.append(ticker)

tickers = [s.replace('\n', '') for s in tickers]
start = datetime.datetime(2019,1,1)
end = datetime.datetime(2019,7,17)
data = yf.download(tickers, start=start, end=end)
print(data)

print(type(data))

data.to_csv('stock_data.csv')

I get these results:

            Adj Close                         ...     Volume
                    A        AAL         AAP  ...        ZBH       ZION        ZTS
Date                                          ...
2018-12-31        NaN        NaN         NaN  ...        NaN        NaN        NaN
2019-01-02  65.271561  32.081280  157.739120  ...  1152100.0  2234100.0  2665600.0
2019-01-03  62.866974  29.690985  162.663467  ...  1166100.0  2078400.0  2390900.0
2019-01-04  65.043022  31.646681  158.628098  ...  1580400.0  2370500.0  3383500.0
2019-01-07  66.424164  32.545509  160.955429  ...   900300.0  2459700.0  2360800.0
...               ...        ...         ...  ...        ...        ...        ...
2019-07-10  73.212486  32.718483  156.368958  ...   632600.0  1404700.0  1763700.0
2019-07-11  71.585976  32.807880  156.229019  ...   885000.0  1500800.0  1588000.0
2019-07-12  71.496178  33.552837  158.518127  ...   644100.0  1565400.0  1473400.0
2019-07-15  70.398537  33.383980  158.857986  ...  1188100.0  1415200.0  1255200.0
2019-07-16  69.799820  33.989880  161.696884  ...  1099400.0  1508700.0  1214600.0

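How can I save them to separate CSV files (one for "A", one for "AAL", one for "AAP", and so on)?

3 answers:

Answer 0 (score: 1)

You just need to iterate over the columns. For example, you could do something like this: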

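import pandas as pd
# stock_data.csv has two header rows (field, ticker); the dates are in the first column
df = pd.read_csv('stock_data.csv', header=[0, 1], index_col=0)
for field, ticker in df.columns:
    # one file per (field, ticker) column, e.g. "Adj Close_A.csv"
    df[(field, ticker)].to_csv(field + '_' + ticker + '.csv')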

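Answer 1 (score: 0)

Because your columns are a MultiIndex, you first need to get the list of relevant column names. You can then filter on them to get the right sub-DataFrames.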

import pandas as pd
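# stock_data.csv has two header rows (field, ticker); the dates are in the first column
df = pd.read_csv('stock_data.csv', header=[0, 1], index_col=0)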
stock_names = df.columns.get_level_values(1)
for stock in stock_names.unique():
    df.loc[:, stock_names == stock].to_csv(stock + '.csv')

If the levels of the index are named, you should pass the level name to get_level_values instead of the position, for readability.
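
As a small illustration of looking a level up by name, the sketch below builds a toy two-level column index by hand. The level names `field` and `ticker` and the sample values are made up for this example; they are not guaranteed to match what yfinance returns.

```python
import pandas as pd

# Toy frame with a two-level column index. The level names "field" and
# "ticker" are assumptions chosen for this example.
columns = pd.MultiIndex.from_product(
    [["Adj Close", "Volume"], ["A", "AAL"]], names=["field", "ticker"]
)
df = pd.DataFrame(
    [[65.0, 32.0, 1000.0, 2000.0], [62.0, 29.0, 1100.0, 2100.0]],
    index=pd.to_datetime(["2019-01-02", "2019-01-03"]),
    columns=columns,
)

# Look the level up by name instead of by position.
stock_names = df.columns.get_level_values("ticker")
unique_stocks = list(stock_names.unique())
print(unique_stocks)

# The boolean mask keeps every field that belongs to one ticker.
print(df.loc[:, stock_names == "A"])
```

Using the name keeps the loop working even if the order of the levels changes later.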

If you want to get rid of the now-redundant level in the split DataFrames, drop it:

stock_df = df.loc[:, stock_names == stock].copy()
stock_df.columns = stock_df.columns.droplevel(1)

Or reorder the levels so that you can access the DataFrame in a more "naive" way:

df.columns = df.columns.reorder_levels([1,0])
...
df[stock].to_csv(stock + '.csv')
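
Putting the pieces together, here is a minimal end-to-end sketch on a hand-built toy DataFrame, so it runs without downloading anything. It uses `DataFrame.xs` as another way to select one ticker and drop its level in a single step; the sample values and the temporary output directory are assumptions made for illustration only.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Toy stand-in for the downloaded data: (field, ticker) columns, dates as index.
columns = pd.MultiIndex.from_product([["Adj Close", "Volume"], ["A", "AAL"]])
df = pd.DataFrame(
    [[65.0, 32.0, 1000.0, 2000.0], [62.0, 29.0, 1100.0, 2100.0]],
    index=pd.DatetimeIndex(["2019-01-02", "2019-01-03"], name="Date"),
    columns=columns,
)

out_dir = Path(tempfile.mkdtemp())  # hypothetical output location
written = {}
for ticker in df.columns.get_level_values(1).unique():
    # xs picks one ticker across all fields and drops the ticker level.
    stock_df = df.xs(ticker, axis=1, level=1)
    path = out_dir / (ticker + ".csv")
    stock_df.to_csv(path)
    written[ticker] = path

# Read one file back to confirm it has one plain column per field.
check = pd.read_csv(written["A"], index_col=0, parse_dates=True)
print(check)
```

Each resulting file has a single header row (the field names) and the dates in the first column, so it can be read back with a plain `read_csv` call.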

Answer 2 (score: 0)

This is what I wanted:

import bs4 as bs
import requests
import yfinance as yf
import datetime
import pandas

resp = requests.get('http://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
soup = bs.BeautifulSoup(resp.text, "html.parser") 
table = soup.find('table', {'class': 'wikitable sortable'})
tickers = []
for row in table.findAll('tr')[1:]:
    ticker = row.findAll('td')[0].text
    tickers.append(ticker)

tickers = [s.replace('\n', '') for s in tickers]
start = datetime.datetime(2010,1,1)
end = datetime.datetime(2019,11,18)

i=1
for ticker in tickers:
  print(i, "Ticker is : ", ticker)
  i=i+1
  data = yf.download(ticker, start=start, end=end)
  data.to_csv(ticker+'.csv')