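Subtracting across rows of different columns in an indexed pandas DataFrame

Date: 2018-10-23 15:24:38

Tags: python pandas indexing

I have an indexed DataFrame (indexed first by type, then by date) and want to subtract the previous row's end time from the next row's start time, with the result in hours: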

type    date             start_time                  end_time       code
A      01/01/2018         01/01/2018 9:00       01/01/2018 14:00      525
       01/02/2018         01/02/2018 5:00       01/02/2018 17:00      524
       01/04/2018         01/04/2018 8:00       01/04/2018 10:00      528
B      01/01/2018         01/01/2018 5:00       01/01/2018 14:00      525
       01/04/2018         01/04/2018 2:00       01/04/2018 17:00      524
       01/05/2018         01/05/2018 7:00       01/05/2018 10:00      528

I want a result table with a new column ['interval']:

type    date             interval
A      01/01/2018           -
       01/02/2018           15
       01/04/2018           39
B      01/01/2018           -
       01/04/2018           60
       01/05/2018           14

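The interval column is in hours.

1 answer:

Answer 0 (score: 0)

You can convert start_time and end_time to datetime, then use groupby on the first index level together with apply to subtract the previous row's end_time (obtained with shift) from each row's start_time within each group. To convert the result to hours, divide by pd.Timedelta('1 hour'). The first row of each group has no previous row, so its interval is NaN.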

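import pandas as pd

# Parse the string columns as datetimes
df['start_time'] = pd.to_datetime(df['start_time'])
df['end_time'] = pd.to_datetime(df['end_time'])

# Within each type: start_time minus the previous row's end_time, in hours
df['interval'] = (df.groupby(level=0, sort=False).apply(lambda x: x.start_time - x.end_time.shift(1)) / pd.Timedelta('1 hour')).values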

>>> df
                         start_time            end_time    code    interval
type date                                                              
A    01/01/2018  2018-01-01 09:00:00  2018-01-01 14:00:00   525       NaN
     01/02/2018  2018-01-02 05:00:00  2018-01-02 17:00:00   524      15.0
     01/04/2018  2018-01-04 08:00:00  2018-01-04 10:00:00   528      39.0
B    01/01/2018  2018-01-01 05:00:00  2018-01-01 14:00:00   525       NaN
     01/04/2018  2018-01-04 02:00:00  2018-01-04 17:00:00   524      60.0
     01/05/2018  2018-01-05 07:00:00  2018-01-05 10:00:00   528      14.0
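
As a possible alternative, the same result can probably be obtained without apply by shifting end_time within each group and subtracting the aligned Series directly. The sketch below is self-contained: the way the sample frame is built and the explicit format string are assumptions made for illustration, and only the values come from the question.

```python
import pandas as pd

# Sample data copied from the question; how the frame is built is an assumption.
df = pd.DataFrame(
    {
        "type": ["A", "A", "A", "B", "B", "B"],
        "date": ["01/01/2018", "01/02/2018", "01/04/2018",
                 "01/01/2018", "01/04/2018", "01/05/2018"],
        "start_time": ["01/01/2018 9:00", "01/02/2018 5:00", "01/04/2018 8:00",
                       "01/01/2018 5:00", "01/04/2018 2:00", "01/05/2018 7:00"],
        "end_time": ["01/01/2018 14:00", "01/02/2018 17:00", "01/04/2018 10:00",
                     "01/01/2018 14:00", "01/04/2018 17:00", "01/05/2018 10:00"],
        "code": [525, 524, 528, 525, 524, 528],
    }
).set_index(["type", "date"])

# Assumed month/day/year layout, matching the dates shown in the question.
fmt = "%m/%d/%Y %H:%M"
df["start_time"] = pd.to_datetime(df["start_time"], format=fmt)
df["end_time"] = pd.to_datetime(df["end_time"], format=fmt)

# Previous row's end_time within each type; the first row of each group gets NaT.
prev_end = df.groupby(level="type", sort=False)["end_time"].shift(1)

# The subtraction aligns on the (type, date) index; dividing by one hour
# turns the timedeltas into float hours.
df["interval"] = (df["start_time"] - prev_end) / pd.Timedelta(hours=1)

print(df[["interval"]])
```

Because shift on the grouped column returns a Series that keeps the original index, the assignment does not rely on .values and row order, which may make it a little less fragile than the apply version.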