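I have three columns: country, type (investment type), and amount. For each investment type, I want to find the country with the largest investment amount. So the expected list of countries is "can, gb, ind".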
import pandas as pd
import numpy as np
df = pd.DataFrame({"country": ["ind", "usa", "gb", "ind", "gb", "usa", "can", "can", "usa", "ind", "gb", "can"], \
"type":["deposit", "bonds", "cash", "cash", "bonds", "deposit", "bonds", "deposit", "deposit", "bonds", "cash", "deposit"], \
"amount": [1000, 120, 90, 200, 150, 300, 100, 400, 250, 300, 250, 5000]})
print(df)
print(df.groupby("type")["amount"].max())
## How to get the corresponding country for the max amount of each investment type?
amount country type
0 1000 ind deposit
1 120 usa bonds
2 90 gb cash
3 200 ind cash
4 150 gb bonds
5 300 usa deposit
6 100 can bonds
7 400 can deposit
8 250 usa deposit
9 300 ind bonds
10 250 gb cash
11 5000 can deposit
type
bonds 300
cash 250
deposit 5000
Name: amount, dtype: int64
I can group by investment type and compute the maximum, but how do I extract the corresponding country name?
Answer 0 (score: 3)
You can use drop_duplicates:
df.sort_values(['type','amount']).drop_duplicates('type',keep='last')
Out[285]:
amount country type
9 300 ind bonds
10 250 gb cash
11 5000 can deposit
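To see why the sort-then-deduplicate approach works, here is a minimal sketch that splits it into two steps on the question's data. The variable names `ordered`, `top_per_type`, and `top_countries` are illustrative only and do not appear in the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "country": ["ind", "usa", "gb", "ind", "gb", "usa",
                "can", "can", "usa", "ind", "gb", "can"],
    "type": ["deposit", "bonds", "cash", "cash", "bonds", "deposit",
             "bonds", "deposit", "deposit", "bonds", "cash", "deposit"],
    "amount": [1000, 120, 90, 200, 150, 300, 100, 400, 250, 300, 250, 5000],
})

# Step 1: sort ascending so that, within each type, the largest amount comes last.
ordered = df.sort_values(["type", "amount"])

# Step 2: keep only the last row of each type, i.e. the row holding the maximum.
top_per_type = ordered.drop_duplicates("type", keep="last")

# Collect the winning countries (illustrative name, in order of type).
top_countries = top_per_type["country"].tolist()
print(top_countries)  # ['ind', 'gb', 'can']
```

Because the frame is sorted by type first, the result comes back in alphabetical order of type (bonds, cash, deposit).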
Or just use idxmax:
df.loc[df.groupby('type')['amount'].idxmax()]
Out[287]:
amount country type
9 300 ind bonds
10 250 gb cash
11 5000 can deposit
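As a rough sketch of how the idxmax result might be turned into a type-to-country lookup: the names `winners`, `top_country_by_type`, and `all_winners` are hypothetical. One point worth knowing is that idxmax returns only the first row when several rows tie for the maximum, so the last part shows a possible alternative using `transform("max")` that keeps every tied row. The question's data has no ties, so both give the same rows here.

```python
import pandas as pd

df = pd.DataFrame({
    "country": ["ind", "usa", "gb", "ind", "gb", "usa",
                "can", "can", "usa", "ind", "gb", "can"],
    "type": ["deposit", "bonds", "cash", "cash", "bonds", "deposit",
             "bonds", "deposit", "deposit", "bonds", "cash", "deposit"],
    "amount": [1000, 120, 90, 200, 150, 300, 100, 400, 250, 300, 250, 5000],
})

# idxmax gives, per type, the index label of the first row with the largest amount.
winners = df.loc[df.groupby("type")["amount"].idxmax()]

# Map each investment type to its top country.
top_country_by_type = winners.set_index("type")["country"].to_dict()
print(top_country_by_type)  # {'bonds': 'ind', 'cash': 'gb', 'deposit': 'can'}

# Alternative that keeps all tied rows: compare each amount with its group maximum.
is_group_max = df["amount"] == df.groupby("type")["amount"].transform("max")
all_winners = df[is_group_max]
```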
Answer 1 (score: 0)
You need to include type in the groupby clause, alongside country:
df.groupby(['country','type'])['amount'].max().reset_index()
Output:
country type amount
0 can bonds 100
1 can deposit 5000
2 gb bonds 150
3 gb cash 250
4 ind bonds 300
5 ind cash 200
6 ind deposit 1000
7 usa bonds 120
8 usa deposit 300
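The grouped table in this answer still has one row per country and type pair. As a possible follow-up step (an assumption, not part of the original answer), one could pick the row with the largest amount within each type from that table. The names `per_pair` and `top` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "country": ["ind", "usa", "gb", "ind", "gb", "usa",
                "can", "can", "usa", "ind", "gb", "can"],
    "type": ["deposit", "bonds", "cash", "cash", "bonds", "deposit",
             "bonds", "deposit", "deposit", "bonds", "cash", "deposit"],
    "amount": [1000, 120, 90, 200, 150, 300, 100, 400, 250, 300, 250, 5000],
})

# Maximum amount for every (country, type) pair, as in the answer.
per_pair = df.groupby(["country", "type"])["amount"].max().reset_index()

# Hypothetical extra step: within each type, keep the pair with the largest amount.
top = per_pair.loc[per_pair.groupby("type")["amount"].idxmax(),
                   ["type", "country", "amount"]]
print(top["country"].tolist())  # ['ind', 'gb', 'can']
```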