How to calculate the time difference between two pandas columns

Time: 2018-07-17 12:22:28

Tags: python pandas dataframe data-analysis

My df looks like this:

    start               stop
0   2015-11-04 10:12:00 2015-11-06 06:38:00
1   2015-11-04 10:23:00 2015-11-05 08:30:00
2   2015-11-04 14:01:00 2015-11-17 10:34:00
4   2015-11-19 01:43:00 2015-12-21 09:04:00

print(df_time.dtypes)

start       datetime64[ns]
stop        datetime64[ns]
dtype: object

I am trying to find the time difference between stop and start.

I tried pd.Timedelta(df_time['stop']-df_time['start']), but it gives TypeError: data type "datetime" not understood

df_time['stop']-df_time['start'] also gives the same error.

My expected output:

 2D,?H
 1D,?H
 ...
 ...

2 answers:

Answer 0: (score: 1)

You need to omit pd.Timedelta, because subtracting one datetime column from the other already returns timedeltas:

df_time['td'] = df_time['stop']-df_time['start']
print (df_time)
                start                stop               td
0 2015-11-04 10:12:00 2015-11-06 06:38:00  1 days 20:26:00
1 2015-11-04 10:23:00 2015-11-05 08:30:00  0 days 22:07:00
2 2015-11-04 14:01:00 2015-11-17 10:34:00 12 days 20:33:00

Edit: another solution is to subtract the underlying numpy arrays:

df_time['td'] = df_time['stop'].values - df_time['start'].values
print (df_time)
                start                stop               td
0 2015-11-04 10:12:00 2015-11-06 06:38:00  1 days 20:26:00
1 2015-11-04 10:23:00 2015-11-05 08:30:00  0 days 22:07:00
2 2015-11-04 14:01:00 2015-11-17 10:34:00 12 days 20:33:00
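
The td column above prints as "N days HH:MM:SS", not in the 2D,?H style the question asks for. A minimal sketch of one way to get there, assuming the goal is whole days plus the remaining whole hours; the column name td_label and the two-row sample frame are made up for illustration:

```python
import pandas as pd

# Two rows copied from the question's data, already parsed as datetimes.
df_time = pd.DataFrame({
    "start": pd.to_datetime(["2015-11-04 10:12:00", "2015-11-04 10:23:00"]),
    "stop": pd.to_datetime(["2015-11-06 06:38:00", "2015-11-05 08:30:00"]),
})

# Subtracting two datetime columns gives a timedelta column.
df_time["td"] = df_time["stop"] - df_time["start"]

# .dt.components splits each timedelta into integer days, hours, minutes, ...
parts = df_time["td"].dt.components

# "td_label" is a hypothetical column name used only for this sketch.
df_time["td_label"] = (
    parts["days"].astype(str) + "D," + parts["hours"].astype(str) + "H"
)
print(df_time["td_label"].tolist())  # ['1D,20H', '0D,22H']

# If a single number is preferred, the total duration in hours:
total_hours = df_time["td"].dt.total_seconds() / 3600
```

Note that the first row comes out as 1D,20H and not 2D, because the difference is just under two days; rounding to the nearest day would need an extra step.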

Answer 1: (score: 1)

First make sure the columns actually hold datetimes:

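# Plain column assignment replaces the column, so the new datetime dtype is kept
# (in recent pandas, assigning through .loc[:, col] can leave an object column as object).
data['start'] = pd.to_datetime(data['start'])
data['stop'] = pd.to_datetime(data['stop'])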

Then subtract:

data['delta'] = data['stop'] - data['start']
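
Put together, the two steps can be sketched end to end as follows. This assumes the columns start out as strings (for example after reading a CSV); the sample rows are taken from the question and the frame is otherwise illustrative:

```python
import pandas as pd

# Columns start as plain strings, as they would after a typical CSV read.
data = pd.DataFrame({
    "start": ["2015-11-04 10:12:00", "2015-11-19 01:43:00"],
    "stop": ["2015-11-06 06:38:00", "2015-12-21 09:04:00"],
})

# Step 1: convert both columns to datetimes.
data["start"] = pd.to_datetime(data["start"])
data["stop"] = pd.to_datetime(data["stop"])

# Step 2: subtract; the result is a timedelta column.
data["delta"] = data["stop"] - data["start"]

print(data.dtypes)
print(data)
```

If some strings might not parse, pd.to_datetime accepts errors="coerce", which turns them into NaT so the subtraction yields NaT for those rows instead of raising.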