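I am trying to switch from a pandas Panel to an xarray.Dataset
I have a dataset created from a dictionary of DataFrames. Each DataFrame contains the data for one stock. The DataFrame rows are trading dates, and the columns are prices and indicators. Example code: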
import pandas as pd
import xarray as xr
panel_dict = {}
panel_dict['AAPL'] = pd.DataFrame({'Open': [100, 105], 'Close': [104, 108],
                                   'SMA200': [102, 110], 'RSI2': [11, 14]},
                                  index=['2017-09-01', '2017-09-02'])
panel_dict['AMZN'] = pd.DataFrame({'Open': [200, 180], 'Close': [190, 170],
                                   'SMA200': [190, 190], 'RSI2': [11, 15]},
                                  index=['2017-09-01', '2017-09-02'])
panel_dict['AGN'] = pd.DataFrame({'Open': [300, 310], 'Close': [300, 310],
                                  'SMA200': [250, 250], 'RSI2': [5, 15]},
                                 index=['2017-09-01', '2017-09-02'])
ds_full = xr.Dataset(panel_dict)
print(ds_full)
# selecting one day works
ds = ds_full.sel(dim_0 = '2017-09-02')
print(ds)
# filtering does not work
c = ds[ds['Close']>ds['SMA200']]
c = c[c['RSI2'] < 12.0 ]
c = c.sort_values(by = 'RSI2', ascending=True)
The dataset ds_full looks like this:
<xarray.Dataset>
Dimensions: (dim_0: 2, dim_1: 4)
Coordinates:
* dim_0 (dim_0) object '2017-09-01' '2017-09-02'
* dim_1 (dim_1) object 'Close' 'Open' 'RSI2' 'SMA200'
Data variables:
AAPL (dim_0, dim_1) int64 104 100 11 102 108 105 14 110
AMZN (dim_0, dim_1) int64 190 200 11 190 170 180 15 190
AGN (dim_0, dim_1) int64 300 300 5 250 310 310 15 250
Selecting one day of data with ds = ds_full.sel(dim_0='2017-09-02') works well:
<xarray.Dataset>
Dimensions: (dim_1: 4)
Coordinates:
dim_0 <U10 '2017-09-02'
* dim_1 (dim_1) object 'Close' 'Open' 'RSI2' 'SMA200'
Data variables:
AAPL (dim_1) int64 108 105 14 110
AMZN (dim_1) int64 170 180 15 190
AGN (dim_1) int64 310 310 15 250
But how can I filter on other conditions, such as 'Close' > 'SMA200' or 'RSI2' < 12? And how can I sort the result by the 'RSI2' column?
In the original code, which used pandas.Panel, it looked like this:
c = ds[ds['Close']>ds['SMA200']]
c = c[c['RSI2'] < 12.0 ]
c = c.sort_values(by = 'RSI2', ascending=True)
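For reference, here is a minimal sketch of one way the filtering and sorting might be done. It is not necessarily the idiomatic xarray approach. It assumes that converting the one-day slice back to a pandas DataFrame is acceptable, with one row per stock and one column per indicator. The helper name `screen` and the dimension name `ticker` are made up for illustration. The dataset is built from explicit `xr.DataArray` objects, which should give the same `dim_0`/`dim_1` layout as shown above.

```python
import pandas as pd
import xarray as xr

dates = ['2017-09-01', '2017-09-02']
panel_dict = {
    'AAPL': pd.DataFrame({'Open': [100, 105], 'Close': [104, 108],
                          'SMA200': [102, 110], 'RSI2': [11, 14]}, index=dates),
    'AMZN': pd.DataFrame({'Open': [200, 180], 'Close': [190, 170],
                          'SMA200': [190, 190], 'RSI2': [11, 15]}, index=dates),
    'AGN': pd.DataFrame({'Open': [300, 310], 'Close': [300, 310],
                         'SMA200': [250, 250], 'RSI2': [5, 15]}, index=dates),
}

# One data variable per stock, each with dims (dim_0, dim_1)
ds_full = xr.Dataset({name: xr.DataArray(df) for name, df in panel_dict.items()})


def screen(ds_full, date, max_rsi=12.0):
    """Hypothetical helper: filter one trading day and sort by RSI2."""
    ds = ds_full.sel(dim_0=date)
    # Stack the per-stock variables along a new 'ticker' dimension, then
    # convert the 2-D result to a DataFrame (rows: tickers, columns: indicators)
    day = ds.to_array(dim='ticker').to_pandas()
    c = day[day['Close'] > day['SMA200']]
    c = c[c['RSI2'] < max_rsi]
    return c.sort_values(by='RSI2', ascending=True)


print(screen(ds_full, '2017-09-01'))
```

Note that with the sample data above, no stock satisfies both conditions on 2017-09-02, so the sketch prints the result for 2017-09-01 instead.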