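Given the data frame below, I want to get the 3 cities with the highest "raised" value for each state.
Each city has many campaigns. I want to sum the "raised" amount per city to get each city's total, then show the top three cities in each state ranked by that total.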
category city created goal name raised state url
0 Medical OXNARD December 30, 2018 15000.0 Wayne's Cancer Care Fund 80.0 CA https://www.gofundme.com/8qz8h6-waynes-cancer-...
1 Medical CHINO HILLS December 4, 2018 2000.0 Mother of two has cancer and needs help 500.0 CA https://www.gofundme.com/3qi0rog
2 Medical BATTLE CREEK December 6, 2018 10000.0 Hospital costs 570.0 MI https://www.gofundme.com/3sbwals
3 Medical FEASTERVILLE TREVOSE December 3, 2018 10000.0 Help raise Joey & Brianna 2200.0 MI https://www.gofundme.com/get-away-from-him
4 Medical WEST PALM BEACH December 12, 2018 6000.0 Kelvin McCray Recovery Fund 2450.0 MI https://www.gofundme.com/send-ricky-to-school
5 Medical JONES December 11, 2018 25000.0 Wheelchair Accessible Vehicle for Taelor 2270.0 OK https://www.gofundme.com/HelpTaelorTransport
6 Medical CONROE December 20, 2018 10000.0 "A Good friend in dire need" 1250.0 OK https://www.gofundme.com/4dmeoko
Sample data in JSON:
[{
"category": "Medical",
"city": "OXNARD",
"created": "December 30, 2018",
"goal": 15000.0,
"name": "Wayne's Cancer Care Fund",
"raised": 80.0,
"state": "CA",
"url": "https://www.gofundme.com/8qz8h6-waynes-cancer-care-fund"
},
{
"category": "Medical",
"city": "CHINO HILLS",
"created": "December 4, 2018",
"goal": 2000.0,
"name": "Mother of two has cancer and needs help",
"raised": 500.0,
"state": "CA",
"url": "https://www.gofundme.com/3qi0rog"
},
{
"category": "Medical",
"city": "BATTLE CREEK",
"created": "December 6, 2018",
"goal": 10000.0,
"name": "Hospital costs",
"raised": 570.0,
"state": "MI",
"url": "https://www.gofundme.com/3sbwals"
},
{
"category": "Medical",
"city": "FEASTERVILLE TREVOSE",
"created": "December 3, 2018",
"goal": 10000.0,
"name": "Help raise Joey & Brianna",
"raised": 2200.0,
"state": "MI",
"url": "https://www.gofundme.com/get-away-from-him"
},
{
"category": "Medical",
"city": "WEST PALM BEACH",
"created": "December 12, 2018",
"goal": 6000.0,
"name": "Kelvin McCray Recovery Fund",
"raised": 2450.0,
"state": "MI",
"url": "https://www.gofundme.com/send-ricky-to-school"
},
{
"category": "Medical",
"city": "JONES",
"created": "December 11, 2018",
"goal": 25000.0,
"name": "Wheelchair Accessible Vehicle for Taelor",
"raised": 2270.0,
"state": "OK",
"url": "https://www.gofundme.com/HelpTaelorTransport"
},
{
"category": "Medical",
"city": "CONROE",
"created": "December 20, 2018",
"goal": 10000.0,
"name": "\"A Good friend in dire need\"",
"raised": 1250.0,
"state": "OK",
"url": "https://www.gofundme.com/4dmeoko"
}]
The expected result should look like this:
123 State1 City1 100
3 City2 99
58 City3 98
8 State2 City4 97
12 City5 96
1 City6 95
This did not really help:
maxRaisedCityByState = a.df.groupby(['state','city'])['raised'].max()
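The answers to the alleged duplicate question did not help.
Answer 0 (score: 1)
I simplified the city and state names to make this easier to follow.
Allow me to try a new table-style solution :)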
<table><tbody><tr><th> </th><th>category</th><th>city</th><th>created</th><th>goal</th><th>name</th><th>raised</th><th>state</th><th>url</th></tr><tr><td>0</td><td>Medical</td><td>City1</td><td>December 30, 2018</td><td>15000.0</td><td>Wayne's Cancer Care Fund</td><td>80.0</td><td>State1</td><td>https://www.gofundme.com/8qz8h6-waynes-cancer-...</td></tr><tr><td>1</td><td>Medical</td><td>City1</td><td>December 4, 2018</td><td>2000.0</td><td>Mother of two has cancer and needs help</td><td>500.0</td><td>State1</td><td>https://www.gofundme.com/3qi0rog</td></tr><tr><td>2</td><td>Medical</td><td>City2</td><td>December 6, 2018</td><td>10000.0</td><td>Hospital costs</td><td>570.0</td><td>State1</td><td>https://www.gofundme.com/3sbwals</td></tr><tr><td>3</td><td>Medical</td><td>City3</td><td>December 3, 2018</td><td>10000.0</td><td>Help raise Joey & Brianna</td><td>2200.0</td><td>State1</td><td>https://www.gofundme.com/get-away-from-him</td></tr><tr><td>4</td><td>Medical</td><td>City4</td><td>December 12, 2018</td><td>6000.0</td><td>Kelvin McCray Recovery Fund</td><td>2450.0</td><td>State2</td><td>https://www.gofundme.com/send-ricky-to-school</td></tr><tr><td>5</td><td>Medical</td><td>City5</td><td>December 11, 2018</td><td>25000.0</td><td>Wheelchair Accessible Vehicle for Taelor</td><td>2270.0</td><td>State2</td><td>https://www.gofundme.com/HelpTaelorTransport</td></tr><tr><td>6</td><td>Medical</td><td>City6</td><td>December 20, 2018</td><td>10000.0</td><td>"A Good friend in dire need"</td><td>1250.0</td><td>State2</td><td>https://www.gofundme.com/4dmeoko</td></tr></tbody></table>
Step 1: Drop the columns we don't need:
df.drop(['category', 'created', 'goal', 'name', 'url'], inplace=True, axis = 1)
Which gives us:
city raised state
0 City1 80.0 State1
1 City1 500.0 State1
2 City2 570.0 State1
3 City3 2200.0 State1
4 City4 2450.0 State2
5 City5 2270.0 State2
6 City6 1250.0 State2
Step 2: Group by state and city and sum the amounts raised (in this example, only City1 has more than one row to add up):
df = df.groupby(['state', 'city']).sum()
Now we have:
raised
state city
State1 City1 580.0
City2 570.0
City3 2200.0
State2 City4 2450.0
City5 2270.0
City6 1250.0
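Note that our index has changed from a plain numeric index to a MultiIndex. The first level (level 0) is the state and the second (level 1) is the city.
Step 3: Sort by the amount raised. As expected, this ignores the MultiIndex order, so after sorting we need to re-sort the index. We sort on level 0 (state) only, with sort_remaining=False so that the city level is not re-sorted: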
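To make the index change concrete, here is a minimal sketch that rebuilds the simplified three-column frame by hand and inspects the grouped result. The variable name `totals` is mine and is used only for illustration; the values are the ones from the table above.

```python
import pandas as pd

# The simplified frame after dropping the unused columns.
df = pd.DataFrame({
    "city": ["City1", "City1", "City2", "City3", "City4", "City5", "City6"],
    "raised": [80.0, 500.0, 570.0, 2200.0, 2450.0, 2270.0, 1250.0],
    "state": ["State1"] * 4 + ["State2"] * 3,
})

# Summing per (state, city) moves both grouping keys into the index.
totals = df.groupby(["state", "city"]).sum()

print(list(totals.index.names))  # ['state', 'city']
print(totals.index.nlevels)      # 2

# A single total is looked up with a (state, city) tuple.
print(totals.loc[("State1", "City1"), "raised"])  # 580.0
```

If you would rather keep state and city as ordinary columns, calling `reset_index()` on the grouped frame should turn the index levels back into columns.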
df.sort_values('raised', ascending=False).sort_index(level=[0], sort_remaining=False).groupby('state').head(3)
Finally we have:
raised
state city
State1 City3 2200.0
City1 580.0
City2 570.0
State2 City4 2450.0
City5 2270.0
City6 1250.0
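For reference, here is a self-contained sketch of the whole pipeline written as a variation on the steps above, not as a replacement for them. Instead of re-sorting the MultiIndex, it keeps state and city as columns (`as_index=False`) and sorts on both columns at once. The names `totals` and `top3` are assumptions made for this example.

```python
import pandas as pd

# The simplified frame used in this answer.
df = pd.DataFrame({
    "city": ["City1", "City1", "City2", "City3", "City4", "City5", "City6"],
    "raised": [80.0, 500.0, 570.0, 2200.0, 2450.0, 2270.0, 1250.0],
    "state": ["State1"] * 4 + ["State2"] * 3,
})

# Total raised per (state, city); as_index=False keeps the keys as columns.
totals = df.groupby(["state", "city"], as_index=False)["raised"].sum()

# Order by state, then by total descending, and keep the first
# three rows of each state. head() preserves the sorted row order.
top3 = (
    totals.sort_values(["state", "raised"], ascending=[True, False])
          .groupby("state")
          .head(3)
)

print(top3)
```

With this sample every state has exactly three cities, so `head(3)` does not drop anything; on the full data set it would trim each state to its three largest totals.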