I have a table that looks like this.
msno date num_25 num_50 num_75 num_985 num_100 num_unq \
0 rxIP2f2aN0rYNp+toI0Obt/N/FYQX8hcO1fTmmy2h34= 20150513 0 0 0 0 1 1
1 rxIP2f2aN0rYNp+toI0Obt/N/FYQX8hcO1fTmmy2h34= 20150709 9 1 0 0 7 11
2 yxiEWwE9VR5utpUecLxVdQ5B7NysUPfrNtGINaM2zA8= 20150105 3 3 0 0 68 36
3 yxiEWwE9VR5utpUecLxVdQ5B7NysUPfrNtGINaM2zA8= 20150306 1 0 1 1 97 27
4 yxiEWwE9VR5utpUecLxVdQ5B7NysUPfrNtGINaM2zA8= 20150501 3 0 0 0 38 38
5 yxiEWwE9VR5utpUecLxVdQ5B7NysUPfrNtGINaM2zA8= 20150702 4 0 1 1 33 10
6 yxiEWwE9VR5utpUecLxVdQ5B7NysUPfrNtGINaM2zA8= 20150830 3 1 0 0 4 7
7 yxiEWwE9VR5utpUecLxVdQ5B7NysUPfrNtGINaM2zA8= 20151107 1 0 0 0 4 5
8 yxiEWwE9VR5utpUecLxVdQ5B7NysUPfrNtGINaM2zA8= 20160110 2 0 1 0 11 6
9 yxiEWwE9VR5utpUecLxVdQ5B7NysUPfrNtGINaM2zA8= 20160316 9 3 4 1 67 50
10 yxiEWwE9VR5utpUecLxVdQ5B7NysUPfrNtGINaM2zA8= 20160510 5 3 2 1 67 66
11 yxiEWwE9VR5utpUecLxVdQ5B7NysUPfrNtGINaM2zA8= 20160804 1 4 5 0 36 43
12 yxiEWwE9VR5utpUecLxVdQ5B7NysUPfrNtGINaM2zA8= 20160926 7 1 0 1 38 20
13 yxiEWwE9VR5utpUecLxVdQ5B7NysUPfrNtGINaM2zA8= 20161115 0 1 4 1 38 40
14 yxiEWwE9VR5utpUecLxVdQ5B7NysUPfrNtGINaM2zA8= 20170106 0 0 0 1 39 38
15 PNxIsSLWOJDCm7pNPFzRO/6Mmg2WeZA2nf6hw6t1x3g= 20151201 3 3 2 0 8 11
16 PNxIsSLWOJDCm7pNPFzRO/6Mmg2WeZA2nf6hw6t1x3g= 20160628 0 0 1 1 1 3
17 PNxIsSLWOJDCm7pNPFzRO/6Mmg2WeZA2nf6hw6t1x3g= 20170106 2 1 0 0 35 34
18 KXF9c/T66LZIzFq+xS64icWMhDQE6miCZAtdXRjZHX8= 20150803 0 0 0 0 16 11
19 KXF9c/T66LZIzFq+xS64icWMhDQE6miCZAtdXRjZHX8= 20160527 4 3 0 2 2 11
20 KXF9c/T66LZIzFq+xS64icWMhDQE6miCZAtdXRjZHX8= 20160808 14 3 4 1 15 31
How should I sum the columns 'num_25', 'num_50', 'num_75', 'num_985', 'num_100', 'num_unq', 'total_secs'
to get their totals, leaving only one row per unique msno?
For example, after grouping all rows that share the same msno, it would produce the result below, with the date column dropped.
msno num_25 num_50 num_75 num_985 num_100 num_unq \
0 rxIP2f2aN0rYNp+toI0Obt/N/FYQX8hcO1fTmmy2h34= 9 1 0 0 8 12
I tried this, but msno is still repeated and the date column is still there.
df_user_logs_v2.groupby(['msno', 'date'])['num_25', 'num_50', 'num_75', 'num_985', 'num_100', 'num_unq', 'total_secs'].sum()
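For comparison, here is a minimal sketch of what grouping on msno alone might look like. Including 'date' in the group keys produces one row per (msno, date) pair, which would explain the repeated msno values. The small frame below is only a stand-in for df_user_logs_v2: the msno values are shortened and the total_secs values are made up, since that column is not visible in the printout above.

```python
import pandas as pd

# Stand-in for df_user_logs_v2 (shortened msno values; total_secs is hypothetical).
df_user_logs_v2 = pd.DataFrame({
    "msno": ["rxIP", "rxIP", "PNxI"],
    "date": [20150513, 20150709, 20151201],
    "num_25": [0, 9, 3],
    "num_50": [0, 1, 3],
    "num_75": [0, 0, 2],
    "num_985": [0, 0, 0],
    "num_100": [1, 7, 8],
    "num_unq": [1, 11, 11],
    "total_secs": [10.0, 20.0, 30.0],  # made-up values for illustration
})

cols = ["num_25", "num_50", "num_75", "num_985", "num_100", "num_unq", "total_secs"]

# Group by msno only, so every date for a user falls into the same group.
# Selecting the columns with a list (double brackets) leaves 'date' out of the result.
# as_index=False keeps msno as an ordinary column instead of the index.
totals = df_user_logs_v2.groupby("msno", as_index=False)[cols].sum()

print(totals)
```

Passing the column names as a list rather than as a bare tuple also matters on recent pandas versions, where the tuple form used in the attempt above is no longer accepted.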