df
  group1 group2        date
1      a      b  2017-01-01
2      b      c  2017-01-05
3      d      b  2017-01-07
4      c      a  2017-01-10
5      a      d  2017-01-15
Expected df
  group1 group2  grp1_diff_days  grp2_diff_days
1      a      b             NaN             NaN
2      b      c               4             NaN
3      d      b             NaN               2
4      c      a               5               9
5      a      d               5               8
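For each group value, I want the number of days since that value last appeared, and I want to put the result in the matching column (grp1_diff_days or grp2_diff_days), regardless of whether the previous appearance was in the other group column.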
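To make the rule concrete, here is a minimal plain-Python sketch of what is being asked, assuming the rows are already sorted by date. The names last_seen, grp1_diff and grp2_diff are illustrative and not part of the original post. The loop is meant only to spell out the expected values, not as an efficient solution.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "group1": ["a", "b", "d", "c", "a"],
        "group2": ["b", "c", "b", "a", "d"],
        "date": pd.to_datetime(
            ["2017-01-01", "2017-01-05", "2017-01-07", "2017-01-10", "2017-01-15"]
        ),
    },
    index=[1, 2, 3, 4, 5],
)

last_seen = {}  # label -> date of the most recent row that contained it
grp1_diff, grp2_diff = [], []

for g1, g2, date in zip(df["group1"], df["group2"], df["date"]):
    # Look up both labels before updating, so a row is never compared with itself.
    prev1, prev2 = last_seen.get(g1), last_seen.get(g2)
    grp1_diff.append((date - prev1).days if prev1 is not None else float("nan"))
    grp2_diff.append((date - prev2).days if prev2 is not None else float("nan"))
    # The label counts as "seen" no matter which column it appeared in.
    last_seen[g1] = date
    last_seen[g2] = date

expected = df[["group1", "group2"]].assign(
    grp1_diff_days=grp1_diff, grp2_diff_days=grp2_diff
)
print(expected)
```

Because last_seen is keyed by the label alone, an earlier appearance in either column counts, which is the "regardless of group" behaviour described above.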
Answer 0 (score: 1)
Setup (assuming your date column is already datetime)
import pandas as pd

df = pd.DataFrame({'group1': {1: 'a', 2: 'b', 3: 'd', 4: 'c', 5: 'a'},
                   'group2': {1: 'b', 2: 'c', 3: 'b', 4: 'a', 5: 'd'},
                   'date': {1: pd.Timestamp('2017-01-01 00:00:00'),
                            2: pd.Timestamp('2017-01-05 00:00:00'),
                            3: pd.Timestamp('2017-01-07 00:00:00'),
                            4: pd.Timestamp('2017-01-10 00:00:00'),
                            5: pd.Timestamp('2017-01-15 00:00:00')}})
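If the date column holds strings rather than datetimes, a conversion along these lines would be needed before the date arithmetic. This is a small sketch; the frame raw is hypothetical and only stands in for data loaded from text.

```python
import pandas as pd

# Hypothetical frame whose dates arrived as plain strings
raw = pd.DataFrame({"date": ["2017-01-01", "2017-01-05"]})

# Parse the strings into datetime values so subtraction yields timedeltas
raw["date"] = pd.to_datetime(raw["date"])

gap = raw["date"].iloc[1] - raw["date"].iloc[0]
print(gap.days)
```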
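# Stack group1/group2 into one long column, keeping the date of each entry
s = df.set_index('date').stack().rename('value').reset_index(0)
# Per value, take the gap to its previous appearance, then reshape the
# long result back into two columns (one per group column) per original row
d = pd.DataFrame(s.groupby('value').date.diff().values.reshape(-1, 2),
                 columns=['g1diff', 'g2diff'],
                 index=df.index)
df.join(d)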
  group1 group2       date g1diff g2diff
1      a      b 2017-01-01    NaT    NaT
2      b      c 2017-01-05 4 days    NaT
3      d      b 2017-01-07    NaT 2 days
4      c      a 2017-01-10 5 days 9 days
5      a      d 2017-01-15 5 days 8 days
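The result above holds timedeltas (NaT, "4 days"), while the question asks for plain day counts with NaN in columns named grp1_diff_days and grp2_diff_days. One possible adaptation is to take .dt.days before reshaping. This is a sketch under the same assumptions as the answer: rows sorted by date and a datetime date column. The variable names diffs and result are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "group1": ["a", "b", "d", "c", "a"],
        "group2": ["b", "c", "b", "a", "d"],
        "date": pd.to_datetime(
            ["2017-01-01", "2017-01-05", "2017-01-07", "2017-01-10", "2017-01-15"]
        ),
    },
    index=[1, 2, 3, 4, 5],
)

# Long format: one entry per (row, group column), in original row order
s = (
    df.set_index("date")[["group1", "group2"]]
    .stack()
    .rename("value")
    .reset_index(0)
)

# Gap to the previous appearance of the same value, as timedeltas
diffs = s.groupby("value")["date"].diff()

# .dt.days turns timedeltas into numbers; NaT becomes NaN, so the dtype is float
d = pd.DataFrame(
    diffs.dt.days.to_numpy().reshape(-1, 2),
    columns=["grp1_diff_days", "grp2_diff_days"],
    index=df.index,
)

result = df[["group1", "group2"]].join(d)
print(result)
```

The reshape(-1, 2) step relies on stack() emitting exactly two entries per row in column order, so it assumes neither group column contains missing values.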