Question

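I have a pandas DataFrame containing 1-minute OHLCV candles (open, high, low, close, volume) from a particular exchange. From this 1-minute DataFrame I want to generate an N-minute DataFrame. To do that, I have to: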

a) open: sample the dfOrig['open'] column every N rows.
b) high: take the maximum of each group of N consecutive rows.
c) low: take the minimum of each group of N consecutive rows.
d) close: sample the dfOrig['close'] column every N rows, starting at row N-1.
e) vol: sum each group of N consecutive rows.

To do this, I wrote the following snippet (newCandleSize = N):

import numpy as np
import pandas as pd

# Create the new DataFrame
df = pd.DataFrame()

# subsample to get the opening values
df['open'] = dfOrig['open'].iloc[::newCandleSize]    

# generate artificial groups to get the high and low
tmpRange = np.arange(len(dfOrig)) // newCandleSize
df['high'] = dfOrig['high'].groupby(tmpRange).max()
df['low'] = dfOrig['low'].groupby(tmpRange).min()

# subsample to get the closing values
df['close'] = dfOrig['close'].iloc[newCandleSize - 1::newCandleSize]

# generate artificial groups to get the vol
df['vol'] = dfOrig['vol'].groupby(tmpRange).sum()

To show the results: in the original DataFrame I have

       open  high       low  close   vol
0   5.800781   6.0  5.800781    6.0  25.0
1   5.800781   6.0  5.800781    6.0   0.0
2   5.800781   6.0  5.800781    6.0   0.0

and in the new DataFrame, with N = 3, I get:

        open      high       low  close   vol
0   5.800781  6.000000  5.800781    NaN  25.0
3   5.800781  6.000000  5.800781    NaN   0.0
6   5.800781  6.000000  5.800781    NaN   0.0

So I see two problems here:

Why do I get NaN in the 'close' column?
How can I make the row index consecutive, starting from zero? Should I fix it at all?

Thanks!
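
A likely cause of the NaN values is index alignment: assigning a Series to a DataFrame column matches rows by index label, not by position. The following is a minimal sketch of that effect, assuming a default integer index; the small `dfOrig` frame and its values are made up for illustration and are not the data from the question.

```python
import pandas as pd

# Hypothetical stand-in for the 1-minute data (values are made up).
dfOrig = pd.DataFrame({
    "open":  [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "close": [1.5, 2.5, 3.5, 4.5, 5.5, 6.5],
})
newCandleSize = 3

opens = dfOrig["open"].iloc[::newCandleSize]                      # index labels 0, 3
closes = dfOrig["close"].iloc[newCandleSize - 1::newCandleSize]   # index labels 2, 5

# Column assignment aligns on index labels: labels 2 and 5 have no match
# among 0 and 3, so the whole 'close' column ends up as NaN.
misaligned = pd.DataFrame()
misaligned["open"] = opens
misaligned["close"] = closes

# Dropping the original labels gives both pieces the index 0, 1, ...
fixed = pd.DataFrame()
fixed["open"] = opens.reset_index(drop=True)
fixed["close"] = closes.reset_index(drop=True)

print(misaligned)
print(fixed)
```

Under the same assumption, the grouped high, low and vol columns carry the labels 0, 1, 2, ..., so against the labels 0, 3, 6 they may line up with the wrong group instead of producing NaN.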

How to aggregate rows in a Pandas DataFrame using a different aggregation operator for each column?
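
One way to apply a different operator to each column is a single grouped aggregation with a dictionary passed to `agg`. This is only a sketch under stated assumptions: the helper name `resample_candles`, the sample frame `df_1m` and its values are hypothetical, and the column names follow the question's snippet.

```python
import numpy as np
import pandas as pd

def resample_candles(df_1m, n):
    """Collapse every n consecutive rows into one candle (hypothetical helper)."""
    # Integer group labels by row position: 0,0,0,1,1,1,... for n = 3.
    groups = np.arange(len(df_1m)) // n
    return df_1m.groupby(groups).agg({
        "open": "first",   # first open of the group
        "high": "max",     # highest high
        "low": "min",      # lowest low
        "close": "last",   # last close of the group
        "vol": "sum",      # total volume
    })

# Made-up sample data, six 1-minute rows.
df_1m = pd.DataFrame({
    "open":  [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "high":  [2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
    "low":   [0.5, 1.5, 2.5, 3.5, 4.5, 5.5],
    "close": [1.5, 2.5, 3.5, 4.5, 5.5, 6.5],
    "vol":   [10.0, 0.0, 5.0, 1.0, 2.0, 3.0],
})
df_3m = resample_candles(df_1m, 3)
print(df_3m)
```

Because every column comes from the same grouping, the result is indexed 0, 1, 2, ... and no alignment step is needed. If the row count is not a multiple of n, the last candle is built from fewer rows. For data with a DatetimeIndex (not shown in the question), `DataFrame.resample` would be another option.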

0 answers: