I am trying to iterate over a dataframe one day at a time.
My data looks like this:
open high low close volume
date
2019-12-18 09:15:00+05:30 182.10 182.30 180.55 181.30 4252638
2019-12-18 09:30:00+05:30 181.30 183.45 181.00 183.20 5869850
2019-12-18 09:45:00+05:30 183.35 184.50 183.05 183.25 5201947
2019-12-18 10:00:00+05:30 183.25 183.30 182.45 182.90 2029440
2019-12-18 10:15:00+05:30 182.95 183.25 181.50 182.00 2613453
... ... ... ... ... ...
2019-12-24 14:15:00+05:30 175.40 175.70 175.10 175.40 480322
2019-12-24 14:30:00+05:30 175.40 175.60 174.65 174.80 1193108
2019-12-24 14:45:00+05:30 174.80 176.10 174.75 175.55 1765370
2019-12-24 15:00:00+05:30 175.50 175.75 174.90 175.50 1369208
2019-12-24 15:15:00+05:30 175.45 175.75 175.20 175.40 2010583
I tried:
(df['date'] >= "18-12-2019 09:00:00") & (df['date'] <= "18-12-2019 16:00:00")
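But I don't want the data for one specific date. I want to split the current dataframe into multiple dataframes by date and store them in an array. How can I do that?
Expected output: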
res = [] # list of dataframes length = number of days
res[0] =
open high low close volume
date
2019-12-18 09:15:00+05:30 182.10 182.30 180.55 181.30 4252638
2019-12-18 09:30:00+05:30 181.30 183.45 181.00 183.20 5869850
2019-12-18 09:45:00+05:30 183.35 184.50 183.05 183.25 5201947
2019-12-18 10:00:00+05:30 183.25 183.30 182.45 182.90 2029440
2019-12-18 10:15:00+05:30 182.95 183.25 181.50 182.00 2613453
res[1] =
open high low close volume
date
2019-12-19 09:15:00+05:30 182.10 182.30 180.55 181.30 4252638
2019-12-19 09:30:00+05:30 181.30 183.45 181.00 183.20 5869850
2019-12-19 09:45:00+05:30 183.35 184.50 183.05 183.25 5201947
2019-12-19 10:00:00+05:30 183.25 183.30 182.45 182.90 2029440
2019-12-19 10:15:00+05:30 182.95 183.25 181.50 182.00 2613453
...
...
res[n] =
2019-12-24 14:15:00+05:30 175.40 175.70 175.10 175.40 480322
2019-12-24 14:30:00+05:30 175.40 175.60 174.65 174.80 1193108
2019-12-24 14:45:00+05:30 174.80 176.10 174.75 175.55 1765370
2019-12-24 15:00:00+05:30 175.50 175.75 174.90 175.50 1369208
2019-12-24 15:15:00+05:30 175.45 175.75 175.20 175.40 2010583
In other words, separate the data by date and put each day's rows into the array.
Answer 0 (score: 1)
You can try the groupby method and aggregate all the other columns however you like. For example:
df.groupby(df.date.dt.date)['open'].sum()
For OHLC this becomes:
df.groupby(df.date.dt.date).agg({
'open': 'first',
'high': 'max',
'low': 'min',
'close': 'last',
})
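Note that the aggregations above return one row per day, not a list of per-day dataframes. If the goal is the list `res` described in the question, one possible sketch is to iterate over the groups instead of aggregating them. This assumes `date` is a timezone-aware DatetimeIndex, as the printed data suggests; the sample frame below is just a few rows copied from the question for illustration.

```python
import pandas as pd

# Small sample built from rows in the question (assumption: `date` is the index).
idx = pd.to_datetime([
    "2019-12-18 09:15:00+05:30",
    "2019-12-18 09:30:00+05:30",
    "2019-12-24 14:15:00+05:30",
    "2019-12-24 14:30:00+05:30",
])
df = pd.DataFrame(
    {
        "open": [182.10, 181.30, 175.40, 175.40],
        "high": [182.30, 183.45, 175.70, 175.60],
        "low": [180.55, 181.00, 175.10, 174.65],
        "close": [181.30, 183.20, 175.40, 174.80],
        "volume": [4252638, 5869850, 480322, 1193108],
    },
    index=idx,
)
df.index.name = "date"

# Group rows by the calendar date of each timestamp (in the index's own
# timezone) and collect each group as its own dataframe.
res = [day_df for _, day_df in df.groupby(df.index.date)]

for day_df in res:
    print(day_df, end="\n\n")
```

If `date` is a regular column rather than the index, grouping on `df["date"].dt.date` should work the same way. A dict comprehension (`{day: day_df for day, day_df in ...}`) may be more convenient than a list if you want to look frames up by date.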