Pandas: get OHLC data for each row

Asked: 2016-03-03 16:23:59

Tags: python numpy pandas dataframe

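I have a DataFrame with a datetime index and a price column. I want OHLC data (open, high, low, close).

I want to resample this DataFrame at a given frequency, once for every row.

frame.resample('60S', how='ohlc') works, but the index of the resulting DataFrame is spaced 60 seconds apart. I want to resample over the preceding 60 seconds at every row (12 rows, if the index is spaced 5s apart). That way I would have OHLC values for each row of the original DataFrame.

I don't think I can use df.resample for this, but perhaps .agg or .map would work.

How do I get OHLC data for each row?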

from datetime import datetime

import numpy as np
import pandas as pd

n = 10000

prices = np.linspace(100.0, 103.0, n) + np.random.normal(0.0, 0.3, n)
f = pd.DataFrame({'price': prices}, index=pd.date_range(end=datetime.utcnow(), freq='5S', periods=n))

ohlcized = f.resample('60S', how='ohlc')  # resampling doesn't give one row per input row (834 != 10000)
len(ohlcized)  # 834
len(f)  # 10000

if len(ohlcized) == len(f):
    print("question answered")
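
For reference, newer pandas versions no longer accept the `how=` keyword, and the same resampling is written as a method chain. Below is a minimal sketch, assuming a recent pandas. The seeded random generator and the fixed end timestamp are assumptions added only to make the run repeatable; they are not part of the original setup.

```python
import numpy as np
import pandas as pd

n = 10000
rng = np.random.default_rng(0)  # seeded generator: an assumption, for repeatability only
prices = np.linspace(100.0, 103.0, n) + rng.normal(0.0, 0.3, n)

# Fixed end timestamp instead of datetime.utcnow(), again only for repeatability.
idx = pd.date_range(end="2016-03-03 19:21:09", freq="5s", periods=n)
f = pd.DataFrame({"price": prices}, index=idx)

# One OHLC bar per 60-second bin, so far fewer rows than the input.
ohlc_60s = f["price"].resample("60s").ohlc()

print(len(f), len(ohlc_60s))
print(ohlc_60s.columns.tolist())
```

Each output row summarises one 60-second bin, which is why the result cannot line up row for row with the 5-second input.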

1 Answer:

Answer 0 (score: 2)

For evenly spaced timestamps:

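bars = 12  # rows per window: 60s / 5s spacing
# Open is the value bars - 1 rows back, i.e. the first row of the trailing window;
# High and Low are the rolling extremes; Close is the current row.
df = pd.concat([f.shift(bars - 1), pd.rolling_max(f, bars), pd.rolling_min(f, bars), f],
               axis=1)
df.columns = ['Open', 'High', 'Low', 'Close']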

>>> df.tail()
                                  Open        High         Low       Close
2016-03-03 19:20:49.336236  102.510446  103.603518  102.438872  102.810945
2016-03-03 19:20:54.336236  102.916919  103.603518  102.438872  103.072880
2016-03-03 19:20:59.336236  103.603518  103.603518  102.438872  103.290665
2016-03-03 19:21:04.336236  102.966331  103.290665  102.438872  103.095781
2016-03-03 19:21:09.336236  102.438872  103.409546  102.438872  103.409546
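
The `pd.rolling_max` and `pd.rolling_min` helpers used above are not available in newer pandas versions. Below is a minimal sketch of the same idea using the `.rolling()` method, assuming a recent pandas. The helper name `rolling_ohlc`, the seeded data and the fixed end timestamp are illustrative and not part of the original answer. The time-based `'60s'` window is one possible variant for timestamps that are not evenly spaced.

```python
import numpy as np
import pandas as pd


def rolling_ohlc(price, window):
    """Per-row OHLC over a trailing window (hypothetical helper).

    `window` is either a row count (e.g. 12) or a time offset (e.g. '60s');
    the offset form needs a sorted DatetimeIndex.
    """
    roll = price.rolling(window)
    return pd.DataFrame({
        "Open": roll.apply(lambda w: w[0], raw=True),  # first value in the window
        "High": roll.max(),
        "Low": roll.min(),
        "Close": price,  # the current row is the last value in the window
    })


# Illustrative data: seeded noise around a linear trend, 5-second spacing.
rng = np.random.default_rng(0)
n = 1000
idx = pd.date_range(end="2016-03-03 19:21:09", freq="5s", periods=n)
price = pd.Series(np.linspace(100.0, 103.0, n) + rng.normal(0.0, 0.3, n),
                  index=idx, name="price")

by_rows = rolling_ohlc(price, 12)     # 12 rows x 5s = 60s, as in the answer
by_time = rolling_ohlc(price, "60s")  # trailing 60 seconds ending at each row

print(by_rows.tail())
```

With a row-count window the first `window - 1` rows are NaN, matching the `shift`-based version. With the `'60s'` window pandas defaults to `min_periods=1`, so the earliest rows are computed from whatever data is available. On evenly spaced 5-second data the two variants agree once a full window exists.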