Given the following data:
Sum amount_net amount_gross symbol Date_Time
ts
7/29/2013 2:17 -68 755,101 -755,101 A 7/29/2013 2:17
7/29/2013 2:17 -21 251,945 -251,945 B 7/29/2013 2:17
7/29/2013 2:16 -1 2,200 -2,200 C 7/29/2013 2:16
7/29/2013 2:17 -5 11,000 -11,000 C 7/29/2013 2:17
7/29/2013 2:08 -1 5,384 -5,384 D 7/29/2013 2:08
7/29/2013 2:09 -3 16,151 -16,151 D 7/29/2013 2:09
7/29/2013 2:13 1 5,384 5,384 D 7/29/2013 2:13
7/29/2013 2:02 20 70,000 70,000 F 7/29/2013 2:02
7/29/2013 2:03 22 77,000 77,000 F 7/29/2013 2:03
7/29/2013 2:04 18 63,000 63,000 F 7/29/2013 2:04
7/29/2013 2:05 15 52,500 52,500 F 7/29/2013 2:05
7/29/2013 2:08 15 52,500 52,500 F 7/29/2013 2:08
7/29/2013 2:09 8 28,000 28,000 F 7/29/2013 2:09
7/29/2013 2:10 22 77,000 77,000 F 7/29/2013 2:10
7/29/2013 2:11 22 77,000 77,000 F 7/29/2013 2:11
7/29/2013 2:12 12 42,000 42,000 F 7/29/2013 2:12
7/29/2013 2:13 5 17,500 17,500 F 7/29/2013 2:13
7/29/2013 2:14 30 105,000 105,000 F 7/29/2013 2:14
7/29/2013 2:15 35 122,500 122,500 F 7/29/2013 2:15
7/29/2013 2:16 35 122,500 122,500 F 7/29/2013 2:16
For each symbol, I want to return the Sum, amount_net and amount_gross at that symbol's maximum time. That is, I want to get:
symbol Time Sum amount_net amount_gross
A 7/29/2013 2:17 -68 755,101 -755,101
B 7/29/2013 2:17 -21 251,945 -251,945
C 7/29/2013 2:17 -5 11,000 -11,000
D 7/29/2013 2:13 1 5,384 5,384
F 7/29/2013 2:16 35 122,500 122,500
Answer 0 (score: 2)
Sort by time, group by symbol, then take the last (that is, the "max time") element from each group.
In [28]: df.sort_values('Date_Time').groupby('symbol').last()
Out[28]:
Date_Time Sum amount_net amount_gross
symbol
A 2013-07-29 02:17:00 -68 755101 -755101
B 2013-07-29 02:17:00 -21 251945 -251945
C 2013-07-29 02:17:00 -5 11000 -11000
D 2013-07-29 02:13:00 1 5384 5384
F 2013-07-29 02:16:00 35 122500 122500
See @Andy's note about parsing the numbers as integers.
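For anyone who wants to try this without the original file, here is a minimal self-contained sketch of the same idea. The frame below is a hand-typed subset of the question's rows (symbols C and D only), with the dates written in ISO form and the amounts already parsed as integers; `latest` and `latest_alt` are just names picked for illustration. It also shows an `idxmax` variant, which is an alternative I am adding, not something from the answer itself.

```python
import pandas as pd

# Hand-typed subset of the question's data (symbols C and D only);
# amounts are assumed to be integers already.
df = pd.DataFrame({
    'symbol': ['C', 'C', 'D', 'D', 'D'],
    'Date_Time': pd.to_datetime([
        '2013-07-29 02:16', '2013-07-29 02:17',
        '2013-07-29 02:08', '2013-07-29 02:09', '2013-07-29 02:13',
    ]),
    'Sum': [-1, -5, -1, -3, 1],
    'amount_net': [2200, 11000, 5384, 16151, 5384],
    'amount_gross': [-2200, -11000, -5384, -16151, 5384],
})

# Sort so each symbol's latest row comes last, then keep that row per group.
latest = df.sort_values('Date_Time').groupby('symbol').last()

# Alternative: look up the row label of each symbol's maximum Date_Time.
latest_alt = df.loc[df.groupby('symbol')['Date_Time'].idxmax()].set_index('symbol')

print(latest)
print(latest_alt)
```

One difference to be aware of: `last()` picks the last non-null value column by column, while the `idxmax` variant always returns one whole row. With no missing values, as here, both give the same result.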
Answer 1 (score: 0)
Simply group by symbol and sum:
In [11]: df1.groupby('symbol').sum()
Out[11]:
Sum amount_net amount_gross
symbol
A -68 755101 -755101
B -21 251945 -251945
C -6 13200 -13200
D -3 26919 -16151
F 259 906500 906500
Note: at the moment it looks like amount_net and amount_gross are not being parsed as integers correctly; they are strings. You can convert them with:
df1[['amount_net', 'amount_gross']] = df1[['amount_net', 'amount_gross']].applymap(lambda x: int(x.replace(',', '')))
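A couple of other ways to do the same conversion, as a sketch only. The question does not show how the data was loaded, so the CSV text below is a hypothetical stand-in, and `raw`, `parsed` and `cols` are names made up for the example. Newer pandas releases (2.1 and later) deprecate `applymap`, so the second option uses the vectorised string methods instead.

```python
import io
import pandas as pd

# Hypothetical CSV text standing in for the original file, which is not shown.
raw = io.StringIO(
    'symbol,amount_net,amount_gross\n'
    'A,"755,101","-755,101"\n'
    'C,"2,200","-2,200"\n'
)

# Option 1: let the parser strip the thousands separators while reading.
parsed = pd.read_csv(raw, thousands=',')

# Option 2: convert columns that were already loaded as strings.
df1 = pd.DataFrame({
    'amount_net': ['755,101', '2,200'],
    'amount_gross': ['-755,101', '-2,200'],
})
cols = ['amount_net', 'amount_gross']
# Remove the commas column by column, then cast each column to int.
df1[cols] = df1[cols].apply(lambda s: s.str.replace(',', '', regex=False).astype(int))

print(parsed.dtypes)
print(df1.dtypes)
```

If you control the loading step, Option 1 is probably the simpler choice, since the columns come out numeric from the start.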