Let's consider a pandas DataFrame defined as follows:
from decimal import Decimal
import pandas as pd
from pandas import Timestamp
dic = {'volume': {('CSC', Timestamp('2016-08-05 00:00:00'), 'CSCF7'): Decimal('13'),
                  ('CSC', Timestamp('2016-08-05 00:00:00'), 'CSCG7'): Decimal('6'),
                  ('CSC', Timestamp('2016-08-05 00:00:00'), 'CSCH7'): Decimal('12'),
                  ('DA', Timestamp('2016-08-05 00:00:00'), 'DCF7'): Decimal('47'),
                  ('DA', Timestamp('2016-08-05 00:00:00'), 'DCG7'): Decimal('16'),
                  ('DA', Timestamp('2016-08-05 00:00:00'), 'DCH7'): Decimal('27')
                  }}
df = pd.DataFrame(dic)
I would like to transform it so that, for each combination of the first two index levels, it returns the label of the 3rd index level that has the highest volume. For the current example that would be:
                highest
CSC 2016-08-05    CSCF7
DA  2016-08-05     DCF7
Does anyone have an idea how to do that?
Answer 0 (score: 1)
You could use groupby on an index level and take idxmax, which returns the full index tuple of the row with the highest volume in each group:
In [317]: df.groupby(level=0).idxmax()
Out[317]:
                                volume
CSC  (CSC, 2016-08-05 00:00:00, CSCF7)
DA     (DA, 2016-08-05 00:00:00, DCF7)
In [318]: df.groupby(level=0).idxmax().volume.apply(pd.Series)
Out[318]:
       0          1      2
CSC  CSC 2016-08-05  CSCF7
DA    DA 2016-08-05   DCF7
Or, group by the first two levels and keep only the last element of each index tuple:
In [338]: df.groupby(level=[0, 1]).volume.idxmax().apply(lambda x: x[-1])
Out[338]:
CSC 2016-08-05 CSCF7
DA 2016-08-05 DCF7
Name: volume, dtype: object
Or, use the .str accessor to index into each tuple:
In [341]: df.groupby(level=[0, 1]).volume.idxmax().str[-1]
Out[341]:
CSC 2016-08-05 CSCF7
DA 2016-08-05 DCF7
Name: volume, dtype: object
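For reference, here is a minimal self-contained sketch of the same groupby / idxmax idea as a plain script. It rests on a few assumptions that are not in the original post: the index level names `product`, `date` and `contract` are hypothetical labels added for readability, and the Decimal values are converted to float before calling idxmax, because the column is otherwise object dtype and idxmax may reject that in some pandas versions.

```python
from decimal import Decimal

import pandas as pd

ts = pd.Timestamp("2016-08-05")
volume = {
    ("CSC", ts, "CSCF7"): Decimal("13"),
    ("CSC", ts, "CSCG7"): Decimal("6"),
    ("CSC", ts, "CSCH7"): Decimal("12"),
    ("DA", ts, "DCF7"): Decimal("47"),
    ("DA", ts, "DCG7"): Decimal("16"),
    ("DA", ts, "DCH7"): Decimal("27"),
}

# Level names are hypothetical; the original DataFrame has unnamed levels.
index = pd.MultiIndex.from_tuples(
    list(volume), names=["product", "date", "contract"]
)
df = pd.DataFrame({"volume": list(volume.values())}, index=index)

# Decimal values give an object-dtype column, so convert before idxmax.
numeric = df["volume"].astype(float)

# For each (product, date) group, idxmax returns the full index tuple
# of the row with the largest volume.
top = numeric.groupby(level=["product", "date"]).idxmax()

# Keep only the last element of each tuple: the 3rd-level label.
highest = top.map(lambda key: key[-1]).rename("highest")

print(highest)
```

The result is a Series indexed by the first two levels whose values are the 3rd-level labels; calling `highest.to_frame()` would give the one-column layout shown in the question.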