Generating a DataFrame with a for loop

Time: 2020-06-08 01:35:25

Tags: python pandas dataframe dictionary finance

I'm doing some portfolio analysis and am trying to get a working function that pulls stock data from a list of ticker symbols. Here is my list:

Ticker_List={'Tickers':['SPY', 'AAPL', 'TSLA', 'AMZN', 'BRK.B', 'DAL', 'EURN', 'AMD', 
     'NVDA', 'SPG', 'DIS', 'SBUX', 'MMP', 'USFD', 'CHEF', 'SYY', 
     'GOOGL', 'MSFT']}

I pass the list through this function, like so:

Port=kit.d(Ticker_List)

def d(Ticker_List):
    x=[]
    for i in Ticker_List['Tickers']:
        x.append(Closing_price_alltime(i))
    return x

def Closing_price_alltime(Ticker):
    Closedf=td_client.get_price_history(Ticker, period_type='year', period=20, frequency_type='daily', frequency=1)
    return Closedf

This pulls the data from TD Ameritrade and returns:

[{'candles': [{'open': 147.46875,'high': 148.21875,
               'low': 146.875,'close': 147.125,
               'volume': 6998100,'datetime': 960181200000},
              {'open': 146.625,'high': 147.78125,
               'low': 145.90625,'close': 146.46875,
               'volume': 4858900,'datetime': 960267600000},
               ...],
  'symbol': 'MSFT',
  'empty': False}]

(This is just a sample.)

Finally, I clean it up:

Port=pd.DataFrame(Port)
Port=Port.drop(columns='empty')

Which gives this DataFrame:

    candles                                                        symbol
0   [{'open': 147.46875, 'high': 148.21875, 'low': 146.875, 'close': 147.125, 'volume': 6998100, 'datetime': 960181200000}, {'open': 146.625, 'high': ...}  SPY
1   [{'open': 3.33259, 'high': 3.401786, 'low': 3.203126, 'close': 3.261161, 'volume': 80917200, 'datetime': 960181200000}, {'open': 3.284599, 'high':...}  AAPL

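How can I pull the closing price out of the nested dictionaries in each row and make it a column, with the ticker symbol (currently in its own column) as the header of that closing-price column? Also, how can I pull the datetime out of each nested dictionary and set it as the index?

EDIT: more information

My original way of building this DataFrame was: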

SPY_close=kit.Closing_price_alltime('SPY')
AAPL_close=kit.Closing_price_alltime('AAPL')
TSLA_close=kit.Closing_price_alltime('TSLA')
AMZN_close=kit.Closing_price_alltime('AMZN')
BRKB_close=kit.Closing_price_alltime('BRK.B')
DAL_close=kit.Closing_price_alltime('DAL')
EURN_close=kit.Closing_price_alltime('EURN')
AMD_close=kit.Closing_price_alltime('AMD')
NVDA_close=kit.Closing_price_alltime('NVDA')
SPG_close=kit.Closing_price_alltime('SPG')
DIS_close=kit.Closing_price_alltime('DIS')
SBUX_close=kit.Closing_price_alltime('SBUX')
MMP_close=kit.Closing_price_alltime('MMP')
USFD_close=kit.Closing_price_alltime('USFD')
CHEF_close=kit.Closing_price_alltime('CHEF')
SYY_close=kit.Closing_price_alltime('SYY')
GOOGL_close=kit.Closing_price_alltime('GOOGL')
MSFT_close=kit.Closing_price_alltime('MSFT')

def Closing_price_alltime(Ticker):
    """
    Gets Closing Price for Past 20 Years w/ Daily Intervals
    and Formats it to correct Date and single 'Closing Price'
    column.
    """
    Raw_close=td_client.get_price_history(Ticker, 
    period_type='year', period=20, frequency_type='daily', frequency=1)
    # One row per candle, indexed by the epoch-millisecond timestamp
    Closedf = pd.DataFrame(Raw_close['candles']).set_index('datetime')
    # Keep only the 'close' column
    Closedf = Closedf.drop(columns=['open', 'high', 'low', 'volume'])
    Closedf.index = pd.to_datetime(Closedf.index, unit='ms')
    Closedf.index.names=['Date']
    Closedf.columns=[f'{Ticker} Close']
    Closedf=Closedf.dropna()
    return Closedf

SPY_pct=kit.pct_change(SPY_close)
AAPL_pct=kit.pct_change(AAPL_close)
TSLA_pct=kit.pct_change(TSLA_close)
AMZN_pct=kit.pct_change(AMZN_close)
BRKB_pct=kit.pct_change(BRKB_close)
DAL_pct=kit.pct_change(DAL_close)
EURN_pct=kit.pct_change(EURN_close)
AMD_pct=kit.pct_change(AMD_close)
NVDA_pct=kit.pct_change(NVDA_close)
SPG_pct=kit.pct_change(SPG_close)
DIS_pct=kit.pct_change(DIS_close)
SBUX_pct=kit.pct_change(SBUX_close)
MMP_pct=kit.pct_change(MMP_close)
USFD_pct=kit.pct_change(USFD_close)
CHEF_pct=kit.pct_change(CHEF_close)
SYY_pct=kit.pct_change(SYY_close)
GOOGL_pct=kit.pct_change(GOOGL_close)
MSFT_pct=kit.pct_change(MSFT_close)

def pct_change(Ticker_ClosingValues):
    """
    Takes Closing Values and Finds Percent Change.
    Closing Value Column must be named 'Closing Price'.
    """
    return_pct=Ticker_ClosingValues.pct_change()
    return_pct=return_pct.dropna()
    return return_pct

Portfolio_hist_rets=[SPY_pct, AAPL_pct, TSLA_pct, AMZN_pct, 
                     BRKB_pct, DAL_pct, EURN_pct, AMD_pct, 
                     NVDA_pct, SPG_pct, DIS_pct, SBUX_pct, 
                     MMP_pct, USFD_pct, CHEF_pct, SYY_pct, 
                     GOOGL_pct, MSFT_pct]

This returned exactly what I wanted:

                     SPY Close  AAPL Close  TSLA Close  AMZN Close  BRK.B Close
Date
2000-06-06 05:00:00  -0.004460    0.017111         NaN   -0.072248    -0.002060
2000-06-07 05:00:00   0.006934    0.039704         NaN    0.024722     0.013416
2000-06-08 05:00:00  -0.003920   -0.018123         NaN    0.001206    -0.004073

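This approach is clearly far less efficient than building the DataFrame from the ticker list with a for loop.

In short, what changes can I make to my new code (above the edit) so that it produces the same end result as my old code (below the edit), namely a correctly formatted and labeled DataFrame?

2 answers:

Answer 0: (score: 0)

Given this return value from Closing_price_alltime: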

d = [{'candles': [{'open': 147.46875,'high': 148.21875,
               'low': 146.875,'close': 147.125,
               'volume': 6998100,'datetime': 960181200000},
              {'open': 146.625,'high': 147.78125,
               'low': 145.90625,'close': 146.46875,
               'volume': 4858900,'datetime': 960267600000}
              ],
      'symbol': 'MSFT',
      'empty': False}]

You can extract the symbol, the datetimes, and the closing prices like this.

import operator
import pandas as pd

data = operator.itemgetter('datetime','close')

symbol = d[0]['symbol']
candles = d[0]['candles']
dt, closing = zip(*map(data, candles))
# for loop equivalent to zip(*map...)
#dt = []
#closing = []
#for candle in candles:
#    dt.append(candle['datetime'])
#    closing.append(candle['close'])

s = pd.Series(data=closing,index=dt,name=symbol)

This builds a single DataFrame with one column of closing prices for each symbol in the list.

results = []
for ticker in Ticker_List['Tickers']:
    d = Closing_price_alltime(ticker)
    symbol = d[0]['symbol']
    candles = d[0]['candles']
    dt, closing = zip(*map(data, candles))
    results.append(pd.Series(data=closing,index=dt,name=symbol))

df = pd.concat(results, axis=1)
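
At this point the index of df still holds epoch milliseconds and the columns are bare symbols. A minimal sketch of the remaining formatting, assuming the target layout from the question ('SPY Close' style headers and a 'Date' index); the small frame below uses made-up values in place of live API data:

```python
import pandas as pd

# Hypothetical stand-in for the concatenated `df` above; the index holds
# epoch milliseconds, as found in each candle's 'datetime' field.
df = pd.DataFrame(
    {"SPY": [147.125, 146.46875], "AAPL": [3.261161, 3.31]},  # made-up prices
    index=[960181200000, 960267600000],
)

# Convert epoch milliseconds to timestamps and label the index.
df.index = pd.to_datetime(df.index, unit="ms")
df.index.name = "Date"

# Suffix each column so it reads like 'SPY Close'.
df = df.add_suffix(" Close")
print(df)
```

Converting once on the combined frame avoids repeating the conversion for every ticker inside the loop.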

For the percent changes, see pandas.DataFrame.pct_change.
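
As a rough illustration of that last step, here is a sketch that applies pct_change to a combined closing-price frame. The prices are invented, and dropping only the all-NaN rows is an assumption made here so that tickers with shorter histories (such as TSLA in the question's output) keep their NaN gaps:

```python
import pandas as pd

nan = float("nan")

# Hypothetical closing prices; NaN marks a date before a ticker was listed.
closes = pd.DataFrame(
    {"SPY Close": [100.0, 110.0, 99.0], "TSLA Close": [nan, 20.0, 25.0]},
    index=pd.to_datetime(["2000-06-05", "2000-06-06", "2000-06-07"]),
)

# Row-over-row fractional change, column by column.
# fill_method=None leaves missing prices as NaN instead of forward-filling.
returns = closes.pct_change(fill_method=None)

# Drop only rows where every column is NaN (here, the first row).
returns = returns.dropna(how="all")
print(returns)
```

Calling dropna() with its default on the combined frame would instead discard every date on which any single ticker is missing.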

Answer 1: (score: 0)

Here is the final function I wrote, which does what I was after:

import operator
import pandas as pd

def Port_consol(Ticker_List):
    """
    Consolidates Ticker Symbol Returns and Returns
    a Single Portfolio
    """
    # Pulls (datetime, close) out of each candle dict
    data = operator.itemgetter('datetime','close')
    Port_=[]
    for i in Ticker_List['Tickers']:
        history = Closing_price_alltime(i)
        symbol = history['symbol']
        candles = history['candles']
        dt, closing = zip(*map(data, candles))
        s = pd.Series(data=closing,index=dt,name=symbol)
        # Epoch milliseconds -> timestamps
        s.index = pd.to_datetime(s.index, unit='ms')
        Port_.append(s)
    # One column per ticker, aligned on date
    Portfolio=pd.concat(Port_, axis=1, sort=False)
    return Portfolio

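I can now pass a ticker list to this function; the data is pulled from TD Ameritrade's API (using the Python package td-ameritrade-python-api) and assembled into a DataFrame of historical closing prices for the tickers I passed in.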
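
For trying the reshaping step without API access, a possible variant is to separate it from the fetching. The sketch below is an illustration rather than part of the original answer: closes_from_history is a hypothetical helper name, and the sample responses are made up in the shape shown in the question.

```python
import pandas as pd

def closes_from_history(histories):
    """Build one closing-price DataFrame from a list of price-history dicts."""
    columns = {}
    for history in histories:
        candles = history["candles"]
        columns[history["symbol"]] = pd.Series(
            [c["close"] for c in candles],
            index=pd.to_datetime([c["datetime"] for c in candles], unit="ms"),
        )
    # A dict of Series aligns on the union of their indexes; gaps become NaN.
    return pd.DataFrame(columns)

# Made-up responses in the shape shown in the question (not real API output).
sample = [
    {"symbol": "SPY", "empty": False, "candles": [
        {"close": 147.125, "datetime": 960181200000},
        {"close": 146.46875, "datetime": 960267600000}]},
    {"symbol": "TSLA", "empty": False, "candles": [
        {"close": 23.89, "datetime": 960267600000}]},
]
portfolio = closes_from_history(sample)
print(portfolio)
```

With the real client, the list passed in would be the per-ticker results of Closing_price_alltime.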