How do I calculate the difference between datetimes within groups in Python?

Time: 2019-01-11 17:29:17

Tags: python pandas numpy datetime pandas-groupby

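I have a df sorted by AccountID and PurchaseDate. What I want to do is compute the difference between consecutive PurchaseDate values within each AccountID group and store it in a new column.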

Input:

AccountID    PurchaseDate           Price
113          2018-09-01 22:56:30    13
113          2018-09-02 22:56:30    19
114          2018-09-01 22:56:30    20
114          2018-09-03 22:56:30    25

Desired output:

AccountID    PurchaseDate           Price    DateDiff
113          2018-09-01 22:56:30    13       null
113          2018-09-02 22:56:30    19       1
114          2018-09-01 22:56:30    20       null
114          2018-09-03 22:56:30    25       2

2 answers:

Answer 0: (score: 2)

You can do it like this:

# PurchaseDate must already be a datetime column (convert with pd.to_datetime if needed)
df['DateDiff'] = df.groupby('AccountID')['PurchaseDate'].diff().dt.days
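For readers who want to run this end to end, here is a minimal sketch using the sample data from the question. The `pd.to_datetime` conversion and the `sort_values` call are assumptions about how the data arrives (as strings, possibly unsorted); they are not part of the answer itself.

```python
import pandas as pd

# Sample data from the question; PurchaseDate starts out as plain strings
df = pd.DataFrame({
    'AccountID': [113, 113, 114, 114],
    'PurchaseDate': ['2018-09-01 22:56:30', '2018-09-02 22:56:30',
                     '2018-09-01 22:56:30', '2018-09-03 22:56:30'],
    'Price': [13, 19, 20, 25],
})

# diff() needs real datetimes, and rows ordered within each account
df['PurchaseDate'] = pd.to_datetime(df['PurchaseDate'])
df = df.sort_values(['AccountID', 'PurchaseDate'])

# Per-group difference to the previous row, reduced to whole days
df['DateDiff'] = df.groupby('AccountID')['PurchaseDate'].diff().dt.days
print(df['DateDiff'].tolist())  # [nan, 1.0, nan, 2.0]
```

The first row of each group has no previous purchase, so its difference is missing (NaN), which corresponds to the null in the desired output.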

Answer 1: (score: 1)

Here is a complete example of how to do it:

import pandas as pd

df = pd.DataFrame({'AccountID': [113, 113, 114, 114],
                   'PurchaseDate': ['2018-09-01 22:56:30',
                                    '2018-09-02 22:56:30',
                                    '2018-09-01 22:56:30',
                                    '2018-09-03 22:56:30'],
                   'Price': [13, 19, 20, 25]})

df['PurchaseDate'] = pd.to_datetime(df['PurchaseDate'])
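# The first row of each group has no previous purchase; fill it with an explicit
# zero Timedelta (a bare integer 0 is not reliably accepted across pandas versions)
df['DateDiff'] = df.groupby('AccountID')['PurchaseDate'].diff().fillna(pd.Timedelta(0))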
#    AccountID        PurchaseDate  Price DateDiff
# 0        113 2018-09-01 22:56:30     13   0 days
# 1        113 2018-09-02 22:56:30     19   1 days
# 2        114 2018-09-01 22:56:30     20   0 days
# 3        114 2018-09-03 22:56:30     25   2 days
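Note that filling with zero makes a first purchase look the same as a same-moment repeat purchase, whereas the question asks for null in those rows. A possible variant, sketched under the assumption that whole days as nullable integers are wanted, keeps the missing values instead. The `HoursDiff` column is a hypothetical addition, included only to show how to get a finer-grained difference from the same timedelta.

```python
import pandas as pd

# Same sample data as above, with PurchaseDate parsed up front
df = pd.DataFrame({
    'AccountID': [113, 113, 114, 114],
    'PurchaseDate': pd.to_datetime(['2018-09-01 22:56:30', '2018-09-02 22:56:30',
                                    '2018-09-01 22:56:30', '2018-09-03 22:56:30']),
    'Price': [13, 19, 20, 25],
})

# Timedelta to the previous purchase; NaT for the first row of each account
delta = df.groupby('AccountID')['PurchaseDate'].diff()

# Whole days as a nullable integer column: <NA>, 1, <NA>, 2
df['DateDiff'] = delta.dt.days.astype('Int64')

# Hypothetical extra column: the same gap in hours (NaN, 24.0, NaN, 48.0)
df['HoursDiff'] = delta.dt.total_seconds() / 3600
```

`.dt.days` drops any partial day, so `total_seconds()` is the safer starting point when purchases are less than a day apart.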
