I have a DataFrame:
Data_c User Rank sequence_in_progress
15-03-2017 2 0 0
15-03-2017 1 1 0
16-03-2017 2 0 0
17-03-2017 2 1 0
18-03-2017 1 0 0
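Now I want to fill in the "sequence_in_progress" column by going through the DataFrame and taking into account the date and the sequence of each user's entries.
Basically, the result should be: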
Data_c User Rank sequence_in_progress
15-03-2017 2 0 1
15-03-2017 1 1 1
16-03-2017 2 0 2
17-03-2017 2 1 3
18-03-2017 1 0 2
In other words, "sequence_in_progress" is the running order in which user "x" chose something, as of the given date.
Thanks in advance for your help.
Answer 0 (score: 1)
I would use pandas groupby. Note that this solution works for any number of users.
import pandas as pd

cc = ['Data_c', 'User', 'Rank']
vals = [['15-03-2017', 2, 0],
        ['15-03-2017', 1, 1],
        ['16-03-2017', 2, 0],
        ['17-03-2017', 2, 1],
        ['18-03-2017', 1, 0]]
frame = pd.DataFrame(vals, columns=cc)
# Create the sequence (1,...,N) for each user
users_sequence = [group.assign(sequence=range(1, len(group) + 1))
                  for key, group in frame.groupby('User')]
# Put everything together, using reindex to restore the original row order
result = pd.concat(users_sequence, axis=0).reindex(frame.index)
print(result)
Data_c User Rank sequence
0 15-03-2017 2 0 1
1 15-03-2017 1 1 1
2 16-03-2017 2 0 2
3 17-03-2017 2 1 3
4 18-03-2017 1 0 2
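
A shorter variant, offered as a minimal sketch: assuming the rows are already in chronological order, pandas' `groupby(...).cumcount()` should produce the same per-user numbering without the concat/reindex step. The column name `sequence_in_progress` is taken from the question, and the frame is rebuilt here only so the snippet runs on its own.

```python
import pandas as pd

frame = pd.DataFrame(
    {
        "Data_c": ["15-03-2017", "15-03-2017", "16-03-2017", "17-03-2017", "18-03-2017"],
        "User": [2, 1, 2, 2, 1],
        "Rank": [0, 1, 0, 1, 0],
    }
)

# cumcount() numbers the rows of each group 0..N-1 in their current order,
# so adding 1 gives a 1-based running sequence per user.
# Assumption: the rows are already sorted by date.
frame["sequence_in_progress"] = frame.groupby("User").cumcount() + 1

print(frame)
```

If the rows are not guaranteed to be in date order, one option is to parse the dates with `pd.to_datetime(frame["Data_c"], format="%d-%m-%Y")` and sort on that before grouping, since `cumcount()` only follows the order the rows appear in.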