Question

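I have a df sorted by AccountID and PurchaseDate. I want to create a new column holding the difference in days between consecutive PurchaseDate values within each AccountID group.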

AccountID       PurchaseDate                 Price
| 113        2018-09-01 22:56:30              13|
| 113        2018-09-02 22:56:30              19|
| 114        2018-09-01 22:56:30              20|
| 114        2018-09-03 22:56:30              25|

to

AccountID       PurchaseDate                 Price          DateDiff
| 113        2018-09-01 22:56:30              13              null|
| 113        2018-09-02 22:56:30              19               1  |
| 114        2018-09-01 22:56:30              20              null|
| 114        2018-09-03 22:56:30              25               2  |

Answer 1

You can do it like this:

# Assumes PurchaseDate is already a datetime64 column
df['DateDiff'] = df.groupby('AccountID')['PurchaseDate'].diff().dt.days
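
One detail worth illustrating: the `.days` attribute keeps only the whole-day component of each timedelta, so purchases 12 hours apart count as 0 days. The sketch below uses hypothetical timestamps (not the question's data) and the column names `WholeDays` and `FractionalDays`, which are made up for illustration. Dividing by `pd.Timedelta(days=1)` is one way to get the elapsed time as a fractional number of days.

```python
import pandas as pd

# Hypothetical data: purchases at different times of day
df = pd.DataFrame({
    'AccountID': [113, 113, 114, 114],
    'PurchaseDate': pd.to_datetime(['2018-09-01 22:00:00',
                                    '2018-09-02 10:00:00',
                                    '2018-09-01 08:00:00',
                                    '2018-09-03 20:00:00']),
})

# Difference from the previous purchase within each account (NaT for the first row)
delta = df.groupby('AccountID')['PurchaseDate'].diff()

df['WholeDays'] = delta.dt.days                       # whole-day component only
df['FractionalDays'] = delta / pd.Timedelta(days=1)   # elapsed time in days
print(df)
```

Which of the two is appropriate depends on whether "one day apart" should mean 24 elapsed hours or simply a later timestamp.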

Answer 2

Here is a complete example of how to do it:

import pandas as pd

df = pd.DataFrame({'AccountID': [113, 113, 114, 114],
                   'PurchaseDate': ['2018-09-01 22:56:30',
                                    '2018-09-02 22:56:30',
                                    '2018-09-01 22:56:30',
                                    '2018-09-03 22:56:30'],
                   'Price': [13, 19, 20, 25]})

df['PurchaseDate'] = pd.to_datetime(df['PurchaseDate'])
# fillna(pd.Timedelta(0)) puts a zero timedelta in each account's first row
df['DateDiff'] = df.groupby('AccountID')['PurchaseDate'].diff().fillna(pd.Timedelta(0))
#    AccountID        PurchaseDate  Price DateDiff
# 0        113 2018-09-01 22:56:30     13   0 days
# 1        113 2018-09-02 22:56:30     19   1 days
# 2        114 2018-09-01 22:56:30     20   0 days
# 3        114 2018-09-03 22:56:30     25   2 days
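
The question asks for null in each account's first row and plain day counts rather than timedeltas. A minimal sketch of one way to get that, assuming pandas' nullable `Int64` dtype is acceptable for the result column. The explicit `sort_values` call is an added safeguard, since `diff()` compares each row with the one before it in the group and so depends on the row order.

```python
import pandas as pd

df = pd.DataFrame({'AccountID': [113, 113, 114, 114],
                   'PurchaseDate': ['2018-09-01 22:56:30',
                                    '2018-09-02 22:56:30',
                                    '2018-09-01 22:56:30',
                                    '2018-09-03 22:56:30'],
                   'Price': [13, 19, 20, 25]})

df['PurchaseDate'] = pd.to_datetime(df['PurchaseDate'])

# diff() is order-dependent, so make sure rows are sorted within each account
df = df.sort_values(['AccountID', 'PurchaseDate'])

# Whole days since the previous purchase; the first row of each account stays <NA>
df['DateDiff'] = (df.groupby('AccountID')['PurchaseDate']
                    .diff()
                    .dt.days
                    .astype('Int64'))
print(df)
```

Using `Int64` (capital I) keeps the day counts as integers while still allowing missing values; without the cast the column would be float64 with NaN in the first row of each group.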

How to calculate the difference between datetimes within groups in Python?