Below is a small sample of a very large DataFrame:
import pandas as pd
In [32]: df3
Out[32]:
Location_ID Time
0 10000000568366 2012-05-31 14:08:00
1 10000000257225 2012-05-31 07:22:00
2 10000000730693 2012-05-31 02:19:00
3 10000000257225 2012-05-30 12:20:00
4 10000001072890 2012-05-30 11:19:00
5 10000000811587 2012-05-31 03:09:00
6 10000000094837 2012-06-02 08:39:00
7 10000000730693 2012-06-01 14:04:00
8 10000000955747 2012-05-31 07:24:00
9 10000000924241 2012-05-30 14:48:00
10 10000000893286 2012-05-18 13:12:00
11 10000000924241 2012-05-31 01:56:00
12 10000000211696 2012-05-30 02:09:00
13 10000000211696 2012-05-29 11:41:00
14 10000000084450 2012-05-31 18:34:00
15 10000000939505 2012-06-02 18:12:00
16 10000000893286 2012-05-31 22:54:00
17 10000000811598 2012-06-01 07:55:00
18 10000000683255 2012-05-29 03:44:00
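I am trying to find the time difference between consecutive "Time" rows for a given Location_ID. I am using pandas.to_numeric to convert the difference to nanoseconds and then dividing by 1000000000 to get the result in seconds: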
df4 = df3.assign(time_difference=df3['Time'].groupby('Location_ID').apply(lambda x : (pd.to_numeric(x.shift()-x).abs())/1000000000))
The error I get is:
KeyError: 'Location_ID'
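A likely cause is that df3['Time'] is a Series, so calling groupby('Location_ID') on it looks for that key in the Series index rather than among the DataFrame columns. The following is a minimal sketch of one possible way to compute the per-location difference, assuming Time is already a datetime64 column. It groups the DataFrame first and then selects the column; the four rows are copied from the sample above, and using diff() with dt.total_seconds() in place of to_numeric is a substitution made here, not part of the original attempt.

```python
import pandas as pd

# Four rows taken from the sample above (two Location_IDs, two rows each).
df3 = pd.DataFrame({
    "Location_ID": [10000000257225, 10000000730693,
                    10000000257225, 10000000730693],
    "Time": pd.to_datetime([
        "2012-05-31 07:22:00",
        "2012-05-31 02:19:00",
        "2012-05-30 12:20:00",
        "2012-06-01 14:04:00",
    ]),
})

# Group the DataFrame (which has the Location_ID column), then pick "Time".
# diff() subtracts the previous row within each group; the first row of
# each group has no predecessor and becomes NaT.
diffs = df3.groupby("Location_ID")["Time"].diff()

# abs() mirrors the original (x.shift() - x).abs(); total_seconds()
# converts the timedeltas to float seconds (NaT becomes NaN).
df4 = df3.assign(time_difference=diffs.abs().dt.total_seconds())
print(df4)
```

The rows are compared in their existing order, as in the original attempt. If the differences should be between chronologically adjacent rows, sorting with df3.sort_values(['Location_ID', 'Time']) beforehand would be one option.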