How do I aggregate rows in a Pandas DataFrame using a different aggregation operator for each column?

Time: 2017-11-26 11:20:03

Tags: python pandas dataframe aggregate resampling

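I have a pandas DataFrame containing 1-minute OHLCV candle data (open, high, low, close, volume) from a particular exchange. From this 1-minute DataFrame I want to generate an N-minute DataFrame. To do that, I have to: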

a) open: sample the original['open'] column, every N rows.
b) high: get the maximum of every size N group of consecutive rows.
c) low: get the minimum of every size N group of consecutive rows.
d) close: sample the original['close'] column, every N rows, starting at N-1.
e) vols: sum every size N group of consecutive rows.
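
The five rules above can be expressed as a single grouped aggregation with one function per column. This is only a minimal sketch: it assumes the 1-minute frame has columns named open, high, low, close and vol (the names used in the snippet further down) and a default integer index. The helper name aggregate_candles and the sample data are hypothetical.

```python
import numpy as np
import pandas as pd


def aggregate_candles(df_orig, n):
    """Collapse every n consecutive rows into one OHLCV row (hypothetical helper)."""
    # Label rows 0..n-1 as group 0, rows n..2n-1 as group 1, and so on.
    groups = np.arange(len(df_orig)) // n
    # One aggregation per column, mirroring rules a) to e).
    return df_orig.groupby(groups).agg(
        {"open": "first", "high": "max", "low": "min", "close": "last", "vol": "sum"}
    )


# Made-up sample data: six 1-minute candles.
df_orig = pd.DataFrame(
    {
        "open": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
        "high": [1.5, 2.5, 3.5, 4.5, 5.5, 6.5],
        "low": [0.5, 1.5, 2.5, 3.5, 4.5, 5.5],
        "close": [1.2, 2.2, 3.2, 4.2, 5.2, 6.2],
        "vol": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
    }
)
df_new = aggregate_candles(df_orig, 3)
print(df_new)
```

Note that "first" and "last" skip missing values, so this differs from strict positional sampling if the data contains NaN, and a trailing group with fewer than N rows is still aggregated rather than dropped.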

To do this, I wrote the following snippet (newCandleSize = N):

import numpy as np
import pandas as pd

# Create the new DataFrame
df = pd.DataFrame()

# subsample to get the opening values
df['open'] = dfOrig['open'].iloc[::newCandleSize]    

# generate artificial groups to get the high and low
tmpRange = np.arange(len(dfOrig)) // newCandleSize
df['high'] = dfOrig['high'].groupby(tmpRange).max()
df['low'] = dfOrig['low'].groupby(tmpRange).min()

# subsample to get the closing values
df['close'] = dfOrig['close'].iloc[newCandleSize - 1::newCandleSize]

# generate artificial groups to get the vol
df['vol'] = dfOrig['vol'].groupby(tmpRange).sum()

To show the results: for the original DataFrame I get

       open  high       low  close   vol
0   5.800781   6.0  5.800781    6.0  25.0
1   5.800781   6.0  5.800781    6.0   0.0
2   5.800781   6.0  5.800781    6.0   0.0

while in the new DataFrame, with N = 3, I get:

        open      high       low  close   vol
0   5.800781  6.000000  5.800781    NaN  25.0
3   5.800781  6.000000  5.800781    NaN   0.0
6   5.800781  6.000000  5.800781    NaN   0.0

So I see two problems here:

  1. Why do I get NaN in the 'close' column?
  2. How can I make the row index consecutive, starting from zero? Should I even fix it?

Thanks!
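
Both questions appear to come down to index alignment: when a Series is assigned to a DataFrame column, pandas matches values by index label, not by position. The sketch below reproduces the effect on made-up data (df_orig and n are stand-ins for dfOrig and newCandleSize). The 'open' sample carries labels 0, 3, 6 while the 'close' sample carries labels 2, 5, 8, so no label matches and every 'close' value becomes NaN. The same reasoning suggests that the groupby results in the snippet above, which are labelled 0, 1, 2, ..., are matched to rows 0, 3, 6 by label rather than by group position.

```python
import numpy as np
import pandas as pd

n = 3  # stands in for newCandleSize
# Made-up data: 9 rows with a default RangeIndex 0..8.
df_orig = pd.DataFrame({"open": np.arange(9.0), "close": np.arange(9.0) + 0.5})

opens = df_orig["open"].iloc[::n]           # labels 0, 3, 6
closes = df_orig["close"].iloc[n - 1 :: n]  # labels 2, 5, 8

# Column assignment aligns on labels: 2, 5, 8 never match 0, 3, 6.
misaligned = pd.DataFrame()
misaligned["open"] = opens
misaligned["close"] = closes

# Dropping the old labels makes both pieces line up positionally (0, 1, 2).
aligned = pd.DataFrame()
aligned["open"] = opens.reset_index(drop=True)
aligned["close"] = closes.reset_index(drop=True)

print(misaligned)
print(aligned)
```

Calling reset_index(drop=True) on each sampled Series before assigning it gives a consecutive zero-based index, which then also lines up with the groupby results.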

0 answers:

No answers