How to filter columns in a MultiIndex DataFrame (pandas)

Time: 2020-08-05 11:14:40

Tags: python pandas dataframe filter yfinance

I have the following DataFrame:

                                 WBA                                   ...                  HD
                                Open       High        Low      Close  ...      l-pc       h-l      h-pc      l-pc
Datetime                                                               ...                                        
2020-06-08 09:30:00-04:00  45.490002  46.090000  45.490002  46.049999  ...       NaN  2.100006       NaN       NaN
2020-06-08 09:35:00-04:00  46.070000  46.330002  46.040001  46.330002  ...  0.009998  1.119904  0.402496  0.717407
2020-06-08 09:40:00-04:00  46.330002  46.660000  46.240002  46.610001  ...  0.090000  0.874893  0.359894  0.514999
2020-06-08 09:45:00-04:00  46.624100  46.950001  46.624100  46.880001  ...  0.014099  0.639999  0.349991  0.290009
2020-06-08 09:50:00-04:00  46.880001  46.990002  46.820000  46.919998  ...  0.060001  0.490005  0.169998  0.320007

This DataFrame was obtained with the following code:

import yfinance as yf
import pandas as pd
import datetime as dt
end = dt.datetime.today()
start = end - dt.timedelta(days=59)
tickers = ['WBA', 'HD']
# group_by='ticker' yields two-level columns of the form (ticker, field)
df = yf.download(tickers, group_by='ticker', start=start, end=end, interval='5m')
for i in tickers:
    # high minus low, then high and low against the previous bar's adjusted close
    df[i, 'h-l'] = abs(df[i]['High'] - df[i]['Low'])
    df[i, 'h-pc'] = abs(df[i]['High'] - df[i]['Adj Close'].shift(1))
    df[i, 'l-pc'] = abs(df[i]['Low'] - df[i]['Adj Close'].shift(1))

I am trying to apply the following function to every ticker in the `tickers` list:

  df['tr']=dff[['h-l','h-pc','l-pc']].max(axis=1)
  df['atr']=df['tr'].rolling(window=n, min_periods=n).mean()

For each ticker I need to compute `tr` and then use `tr` to compute `atr`, but I cannot get `tr`.

1 Answer:

Answer 0: (score: 0)

Access the columns systematically through tuples and it all works.
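As a small illustration of what tuple access means here, the sketch below uses a made-up two-ticker frame (the name `demo` and its prices are hypothetical, not the downloaded data). Each column is addressed by a `(ticker, field)` tuple, while a bare second-level label such as `'h-l'` is not a valid top-level key, which is presumably why the attempt in the question fails.

```python
import pandas as pd

# Made-up two-level columns mimicking the (ticker, field) layout in the question.
cols = pd.MultiIndex.from_product([["WBA", "HD"], ["High", "Low"]])
demo = pd.DataFrame(
    [[46.09, 45.49, 251.0, 248.9],
     [46.33, 46.04, 250.5, 249.4]],
    columns=cols,
)

# A tuple names exactly one column: (level-0 label, level-1 label).
wba_high = demo[("WBA", "High")]

# Assigning to a new tuple adds a column under that ticker.
demo[("WBA", "h-l")] = (demo[("WBA", "High")] - demo[("WBA", "Low")]).abs()

# A bare level-1 label is looked up on level 0, so it raises KeyError.
try:
    demo["h-l"]
    found = True
except KeyError:
    found = False
```

The answer's code below uses the same tuple keys on the downloaded frame.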

import yfinance as yf
import pandas as pd
import datetime as dt
end = dt.datetime.today()
start = end - dt.timedelta(days=59)
tickers = ['WBA', 'HD']

# group_by='ticker' yields two-level columns of the form (ticker, field);
# auto_adjust=False keeps the 'Adj Close' column, which some yfinance versions drop by default
df = yf.download(tickers, group_by='ticker', start=start, end=end,
                 interval='5m', auto_adjust=False)
dfc = df.copy()
for t in tickers:
    dfc[(t, "h-l")] = abs(dfc.loc[:, (t, 'High')] - dfc.loc[:, (t, 'Low')])
    dfc[(t, "h-pc")] = abs(dfc.loc[:, (t, 'High')] - dfc.loc[:, (t, 'Adj Close')].shift(1))
    dfc[(t, "l-pc")] = abs(dfc.loc[:, (t, 'Low')] - dfc.loc[:, (t, 'Adj Close')].shift(1))

# access all the new columns through tuples, e.g. ("WBA", "h-l") ...
# note: this takes the row-wise max over the range columns of ALL tickers together
dfc["tr"] = dfc[[(t, c) for t in tickers for c in ['h-l', 'h-pc', 'l-pc']]].max(axis=1)

n = 5  # rolling window length
dfc["atr"] = dfc['tr'].rolling(window=n, min_periods=n).mean()
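
Note that `tr` above is the row-wise maximum over the range columns of every ticker at once, so WBA and HD end up sharing a single `tr`/`atr` pair. If a separate true range and ATR per ticker is what is wanted, which is how I read the question, one possible sketch follows. It uses made-up prices instead of a yfinance download so it runs offline; the helper name `add_atr`, the frame name `prices`, and the window `n = 2` are all hypothetical choices for illustration.

```python
import pandas as pd

# Made-up adjusted closes; High/Low are placed 0.5 above/below each close.
idx = pd.date_range("2020-06-08 09:30", periods=4, freq="5min")
closes = {
    "WBA": [46.0, 46.5, 46.0, 48.0],
    "HD": [250.0, 251.0, 250.0, 250.5],
}
frames = {}
for t, values in closes.items():
    c = pd.Series(values, index=idx)
    frames[t] = pd.DataFrame({"High": c + 0.5, "Low": c - 0.5, "Adj Close": c})
prices = pd.concat(frames, axis=1)  # columns become (ticker, field) tuples


def add_atr(df, ticker, n):
    """Add (ticker, 'tr') and (ticker, 'atr') columns for one ticker."""
    prev_close = df[(ticker, "Adj Close")].shift(1)
    ranges = pd.concat(
        [
            (df[(ticker, "High")] - df[(ticker, "Low")]).abs(),
            (df[(ticker, "High")] - prev_close).abs(),
            (df[(ticker, "Low")] - prev_close).abs(),
        ],
        axis=1,
    )
    # max skips NaN, so the first row falls back to high minus low
    df[(ticker, "tr")] = ranges.max(axis=1)
    df[(ticker, "atr")] = df[(ticker, "tr")].rolling(window=n, min_periods=n).mean()
    return df


n = 2  # small window only because the sample has four rows
for t in closes:
    prices = add_atr(prices, t, n)
```

After the loop, `prices[("WBA", "atr")]` and `prices[("HD", "atr")]` hold one ATR series per ticker, and `prices["WBA"]` selects every column belonging to that ticker.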