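I want to use a datetime column as the main index, but it contains many duplicates. What I would like is to add artificial milliseconds that act as a "counter" within each group of identical seconds.
For example, the original DataFrame looks like this: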
Bid BidVol
2016-06-27 13:00:10 4183.50 0
2016-06-27 13:00:10 4183.50 0
2016-06-27 13:00:10 4183.50 0
2016-06-28 13:00:10 4249.25 1
2016-06-28 13:00:10 4249.25 1
2016-06-28 13:00:10 4249.00 1
2016-06-28 13:00:10 4248.75 1
2016-06-28 13:00:10 4248.75 2
2016-06-28 13:00:10 4248.75 1
2016-06-28 13:00:10 4248.75 2
2016-06-28 13:00:12 4248.50 0
2016-06-28 13:00:12 4248.50 0
2016-06-29 13:00:12 4353.75 0
2016-06-29 13:00:12 4353.75 0
2016-06-29 13:00:12 4353.75 0
2016-06-29 13:00:12 4354.00 1
2016-06-29 13:00:12 4354.00 1
2016-06-29 13:00:12 4353.75 0
2016-06-29 13:00:12 4354.00 1
2016-06-29 13:00:12 4354.00 1
2016-06-29 13:00:12 4354.00 1
2016-06-29 13:00:12 4354.00 1
2016-06-30 13:00:10 4394.00 0
2016-06-30 13:00:11 4394.25 1
2016-06-30 13:00:11 4394.00 0
My goal is to change the duplicated rows to:
2016-06-28 13:00:10
2016-06-28 13:00:10.001000
2016-06-28 13:00:10.002000
2016-06-28 13:00:10.003000
2016-06-28 13:00:10.004000
2016-06-28 13:00:10.005000
2016-06-28 13:00:10.006000
I tried using groupby, and I can print the milliseconds with a loop:
for name, group in test.groupby(test.index):
    print('------')
    i = 0
    for idx, values in group.iterrows():
        print(idx + pd.Timedelta(milliseconds=i))
        i += 1
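But I don't know how to actually change the index. What is the most efficient way to get the result I need? Efficiency matters here, because the main dataset is very large.
Answer 0 (score: 2)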
You can use cumcount to create the millisecond counter, convert it with to_timedelta, and add it to the index:
a = df.groupby(level=0).cumcount()
print (a)
2016-06-27 13:00:10 0
2016-06-27 13:00:10 1
2016-06-27 13:00:10 2
2016-06-28 13:00:10 0
2016-06-28 13:00:10 1
2016-06-28 13:00:10 2
2016-06-28 13:00:10 3
2016-06-28 13:00:10 4
2016-06-28 13:00:10 5
2016-06-28 13:00:10 6
2016-06-28 13:00:12 0
2016-06-28 13:00:12 1
2016-06-29 13:00:12 0
2016-06-29 13:00:12 1
2016-06-29 13:00:12 2
2016-06-29 13:00:12 3
2016-06-29 13:00:12 4
2016-06-29 13:00:12 5
2016-06-29 13:00:12 6
2016-06-29 13:00:12 7
2016-06-29 13:00:12 8
2016-06-29 13:00:12 9
2016-06-30 13:00:10 0
2016-06-30 13:00:11 0
2016-06-30 13:00:11 1
dtype: int64
df.index = df.index + pd.to_timedelta(a, unit='ms')
print (df)
Bid BidVol
2016-06-27 13:00:10.000 4183.50 0
2016-06-27 13:00:10.001 4183.50 0
2016-06-27 13:00:10.002 4183.50 0
2016-06-28 13:00:10.000 4249.25 1
2016-06-28 13:00:10.001 4249.25 1
2016-06-28 13:00:10.002 4249.00 1
2016-06-28 13:00:10.003 4248.75 1
2016-06-28 13:00:10.004 4248.75 2
2016-06-28 13:00:10.005 4248.75 1
2016-06-28 13:00:10.006 4248.75 2
2016-06-28 13:00:12.000 4248.50 0
2016-06-28 13:00:12.001 4248.50 0
2016-06-29 13:00:12.000 4353.75 0
2016-06-29 13:00:12.001 4353.75 0
2016-06-29 13:00:12.002 4353.75 0
2016-06-29 13:00:12.003 4354.00 1
2016-06-29 13:00:12.004 4354.00 1
2016-06-29 13:00:12.005 4353.75 0
2016-06-29 13:00:12.006 4354.00 1
2016-06-29 13:00:12.007 4354.00 1
2016-06-29 13:00:12.008 4354.00 1
2016-06-29 13:00:12.009 4354.00 1
2016-06-30 13:00:10.000 4394.00 0
2016-06-30 13:00:11.000 4394.25 1
2016-06-30 13:00:11.001 4394.00 0
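The approach above can be wrapped in a small helper. This is only a sketch under a few assumptions: the names add_ms_counter, sample, and result are hypothetical and do not appear in the original answer, and the guard assumes that fewer than 1000 rows share any one second (otherwise the millisecond offset would spill into the next second and could create new duplicates).

```python
import pandas as pd


def add_ms_counter(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df whose repeated timestamps are offset by 0, 1, 2, ... ms."""
    # Position of each row within its group of identical timestamps.
    counter = df.groupby(level=0).cumcount()

    # Assumption: fewer than 1000 rows share any one second; otherwise the
    # offset would reach the next second and could collide with its rows.
    if len(counter) and counter.max() >= 1000:
        raise ValueError("a timestamp repeats 1000+ times; use a finer unit")

    # Use plain values so the addition is positional, not label-aligned.
    offsets = pd.to_timedelta(counter.to_numpy(), unit="ms")

    out = df.copy()
    out.index = df.index + offsets
    return out


# Small sample taken from the last three rows of the question's data.
idx = pd.to_datetime(
    ["2016-06-30 13:00:10", "2016-06-30 13:00:11", "2016-06-30 13:00:11"]
)
sample = pd.DataFrame(
    {"Bid": [4394.00, 4394.25, 4394.00], "BidVol": [0, 1, 0]}, index=idx
)
result = add_ms_counter(sample)
print(result)
```

Because cumcount and the index addition are both vectorized, this avoids the Python-level loop over rows from the question, which is what matters for a very large dataset. If a second can contain 1000 or more rows, passing unit='us' to to_timedelta instead is one possible way to leave more room.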