Cannot access a specific column after resampling daily data to weekly in Pandas

Time: 2018-08-02 18:18:29

Tags: python pandas grouping resampling quantitative-finance

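I tried to convert daily data to weekly data in Pandas, but when I try to run a calculation on a specific column of the resampled DataFrame, it fails.

Here is my code: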

df_w['Date'] = pd.to_datetime(df_w['Date'])
df_w.set_index('Date', inplace=True)
df_w.sort_index(inplace=True)


def take_first(array_like):
    return array_like[0]

def take_last(array_like):
    return array_like[-1]

output_w = df_w.resample('W',                                 # Weekly resample
                how={'Date2': take_first,
                    'Open': take_first, 
                     'High': 'max',
                     'Low': 'min',
                     'Close': take_last,
                     'Volume': 'sum'}, 
                loffset=pd.offsets.timedelta(days=-6))  # to put the labels to Monday


df_w = output_w[['Date2','Open', 'High', 'Low', 'Close', 'Volume']]

df_w[ 'EMA_40' ] = df_w['Close'].ewm( span = 40, adjust=False ).mean()

Here is the error message:

df_w[ 'EMA_40' ] = df_w['Close'].ewm( span = 40, adjust=False ).mean()
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/window.py", line 2178, in mean
return self._apply('ewma', **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/window.py", line 2149, in _apply
values = self._prep_values(b.values)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/window.py", line 219, in _prep_values
dtype=values.dtype))
NotImplementedError: ops for EWM for this dtype datetime64[ns] are not implemented
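
The traceback shows that `ewm` received datetime64 data rather than floats, so it can help to check what `df_w['Close']` actually returns. The following is a minimal diagnostic sketch, not a confirmed explanation of this case. It assumes one possible cause, a duplicated `Close` label with one copy holding dates. The `demo` frame and its values are hypothetical.

```python
import pandas as pd

# Hypothetical frame: two columns share the label 'Close',
# one holding dates and one holding prices (an assumed cause).
demo = pd.DataFrame(
    [[pd.Timestamp("2014-08-01"), 41.41],
     [pd.Timestamp("2014-08-08"), 41.44]],
    columns=["Close", "Close"],
)

# With a duplicated label, selecting it returns a DataFrame, not a Series.
selected = demo["Close"]
print(type(selected).__name__)

# Per-column dtypes show whether a datetime column is hiding under the label.
print(demo.dtypes)

# List any labels that occur more than once.
dup_labels = demo.columns[demo.columns.duplicated()].tolist()
print(dup_labels)
```

If the same checks on the real `df_w` show a single float64 `Close` column, the cause lies elsewhere and this assumption does not apply.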

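I realise I made a mistake when grouping the data, because printing the new 'Close' does not print a single column. Instead it prints the whole DataFrame with 'Open', 'High', 'Low' and 'Close'.

This is the output when printing df_w and df_w['Close']: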

          index                  close   ...      Open      Low     High
Date                                           ...                            
2014-07-28   1008 2014-08-01   41.41   ...     41.12  41.0200  41.7300
2014-08-04   1003 2014-08-08   41.44   ...     40.79  40.7800  41.4600
2014-08-11    998 2014-08-15   41.97   ...     41.84  41.7400  42.2200
2014-08-18    993 2014-08-22   42.45   ...     42.75  42.2100  42.7500

Is there a way to fix this?
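
One possible direction, as a sketch only: recent pandas releases no longer accept the `how=` and `loffset=` arguments to `resample`, so the weekly OHLC aggregation can be written with `.agg()` and the Monday labels obtained through `label` and `closed`. The synthetic `daily` frame below is an assumption standing in for the real data, and the `Date2` column is left out.

```python
import numpy as np
import pandas as pd

# Synthetic daily bars on business days (assumed data, not the original set).
idx = pd.date_range("2018-01-01", periods=60, freq="B", name="Date")
rng = np.random.default_rng(0)
close = 50 + rng.normal(0, 0.5, len(idx)).cumsum()
daily = pd.DataFrame(
    {
        "Open": close - 0.1,
        "High": close + 0.3,
        "Low": close - 0.3,
        "Close": close,
        "Volume": rng.integers(1_000, 5_000, len(idx)),
    },
    index=idx,
)

# 'W-MON' with closed='left' and label='left' builds Monday-to-Sunday bins
# and labels each bin with its Monday.
weekly = daily.resample("W-MON", label="left", closed="left").agg(
    {"Open": "first", "High": "max", "Low": "min",
     "Close": "last", "Volume": "sum"}
)

# 'Close' is now a single float column, so the EMA can be computed on it.
weekly["EMA_40"] = weekly["Close"].ewm(span=40, adjust=False).mean()
print(weekly.head())
```

The string aggregators `'first'` and `'last'` stand in for the `take_first` and `take_last` helpers. Whether this matches the intended week boundaries for the real data would need checking.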

I could get the data from Quandl and use that instead, but Quandl no longer provides data after 27 March 2018.

Edit: this is the output before resampling:

print ( df_w[ 'Close' ])
print ( df_w[ 'High' ])

2018-08-01 00:00:00    52.450
2018-08-03 13:35:00    52.610
Name: Close, Length: 1009, dtype: float64

2018-08-01 00:00:00    52.6000
2018-08-03 13:35:00    52.8700
Name: High, Length: 1009, dtype: float64

0 answers:
