There must be a lean way to do this:
DataFrame:
t, ID
700, 1
900, 1
1000, 1
1100, 1
300, 2
100, 3
200, 3
Expected result:
elapsed, visits, 1/f, ID
400, 4, 100, 1
0, 1, 0, 2
100, 2, 50, 3
Should I use groupby? resample? Should I set ID as the index?
Answer 0 (score: 3)
In [88]: result = df.groupby(['ID'])['t'].agg(['min', 'max', 'count'])
In [89]: result['elapsed'] = result['max']-result['min']
In [90]: result['1/f'] = result['elapsed']/result['count']
In [91]: result = result[['elapsed','count', '1/f']].rename(columns={'count':'visits'})
In [92]: result = result.reset_index()
In [93]: result
Out[93]:
   ID  elapsed  visits  1/f
0   1      400       4  100
1   2        0       1    0
2   3      100       2   50
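For anyone who wants to run this outside an interactive session, here is a minimal self-contained sketch of the same groupby approach. The sample frame is rebuilt from the question's data, and the function name `visit_stats` is made up for illustration; it is not part of pandas. Note that under Python 3 the `/` operator gives floats, so the `1/f` column should come out as 100.0, 0.0, 50.0 rather than the integers shown above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "t": [700, 900, 1000, 1100, 300, 100, 200],
    "ID": [1, 1, 1, 1, 2, 3, 3],
})


def visit_stats(frame):
    """Hypothetical helper: per-ID elapsed time, visit count, and elapsed/visits."""
    # One pass over the groups collects everything the derived columns need.
    stats = frame.groupby("ID")["t"].agg(["min", "max", "count"])
    stats["elapsed"] = stats["max"] - stats["min"]
    # Same definition as the answer above: elapsed divided by number of visits.
    stats["1/f"] = stats["elapsed"] / stats["count"]
    out = stats[["elapsed", "count", "1/f"]].rename(columns={"count": "visits"})
    # Turn the ID index back into an ordinary column.
    return out.reset_index()


result = visit_stats(df)
print(result)
```

With this approach there is no need to set ID as the index beforehand, since `groupby` handles the grouping and `reset_index` returns ID as a regular column.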