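I have a pandas DataFrame containing 1-minute OHLCV candles (open, high, low, close, volume) from a particular exchange. From this 1-minute DataFrame I want to generate an N-minute DataFrame. To do that, I have to: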
a) open: sample the original['open'] column, every N rows.
b) high: get the maximum of every size N group of consecutive rows.
c) low: get the minimum of every size N group of consecutive rows.
d) close: sample the original['close'] column, every N rows, starting at N-1.
e) vols: sum every size N group of consecutive rows.
To do this, I wrote the following snippet (newCandleSize = N):
# Create the new DataFrame
df = pd.DataFrame()
# subsample to get the opening values
df['open'] = dfOrig['open'].iloc[::newCandleSize]
# generate artificial groups to get the high and low
tmpRange = np.arange(len(dfOrig)) // newCandleSize
df['high'] = dfOrig['high'].groupby(tmpRange).max()
df['low'] = dfOrig['low'].groupby(tmpRange).min()
# subsample to get the closing values
df['close'] = dfOrig['close'].iloc[newCandleSize - 1::newCandleSize]
# generate artificial groups to get the vol
df['vol'] = dfOrig['vol'].groupby(tmpRange).sum()
To show the results: on the original DataFrame, I get
open high low close vol
0 5.800781 6.0 5.800781 6.0 25.0
1 5.800781 6.0 5.800781 6.0 0.0
2 5.800781 6.0 5.800781 6.0 0.0
And in the new DataFrame, with N = 3, I get:
open high low close vol
0 5.800781 6.000000 5.800781 NaN 25.0
3 5.800781 6.000000 5.800781 NaN 0.0
6 5.800781 6.000000 5.800781 NaN 0.0
So I see two problems here:
Thanks!
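
For reference, here is a minimal sketch of steps a) to e) done in a single positional `groupby`. It assumes the columns are named `open`, `high`, `low`, `close` and `vol` as in the snippet above. The function name `resample_candles` and the sample values are made up for illustration. The NaN values in `close` appear to come from index alignment: `df` takes its index from `dfOrig['open'].iloc[::newCandleSize]` (labels 0, 3, 6, ...), while the subsampled `close` series carries labels N-1, 2N-1, ... (2, 5, 8, ... for N = 3), so no labels match when it is assigned. The groupby results are labelled 0, 1, 2, ... and are aligned by label in the same way. Aggregating every column through the same groups avoids this.

```python
import numpy as np
import pandas as pd


def resample_candles(df_orig: pd.DataFrame, new_candle_size: int) -> pd.DataFrame:
    """Aggregate consecutive groups of `new_candle_size` rows into one candle.

    Hypothetical helper, not part of pandas. If len(df_orig) is not a
    multiple of new_candle_size, the last candle is built from fewer rows.
    """
    # Positional group label per row: 0,0,0,1,1,1,... for new_candle_size = 3
    groups = np.arange(len(df_orig)) // new_candle_size
    # Every column goes through the same groups, so all results share one index
    return df_orig.groupby(groups).agg(
        open=("open", "first"),    # a) first open of each group
        high=("high", "max"),      # b) maximum high of each group
        low=("low", "min"),        # c) minimum low of each group
        close=("close", "last"),   # d) last close of each group
        vol=("vol", "sum"),        # e) total volume of each group
    )


# Synthetic 1-minute candles, for illustration only
df_orig = pd.DataFrame({
    "open":  [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "high":  [1.5, 2.5, 3.5, 4.5, 5.5, 6.5],
    "low":   [0.5, 1.5, 2.5, 3.5, 4.5, 5.5],
    "close": [1.2, 2.2, 3.2, 4.2, 5.2, 6.2],
    "vol":   [10.0, 0.0, 5.0, 1.0, 2.0, 3.0],
})

df_new = resample_candles(df_orig, 3)
print(df_new)
```

The resulting index is the group number (0, 1, ...) rather than the original row label. If the DataFrame had a DatetimeIndex, `DataFrame.resample` with the same aggregations would be another option, but that depends on how the timestamps are stored.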