Sorting a pandas DataFrame with a MultiIndex

Time: 2016-12-13 11:18:42

Tags: python-3.x sorting pandas dataframe

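I have a DataFrame census_df that contains data on the counties and states of the US. The DataFrame uses a nested index: the first level (STNAME) is the state and the second level (CTYNAME) is the county, while the population is held in the CENSUS2010POP column. The DataFrame looks like this.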

                              CENSUS2010POP
STNAME    CTYNAME                          
Alabama   Alabama                   4779736
          Autauga County              54571
          Baldwin County             182265
          Barbour County              27457
          Bibb County                 22915
          Blount County               57322
          Bullock County              10914
          Butler County               20947
          Calhoun County             118572
          Chambers County             34215
          Cherokee County             25989
          Chilton County              43643
          Choctaw County              13859
          Clarke County               25833
          Clay County                 13932
          Cleburne County             14972
          Coffee County               49948
          Colbert County              54428
          Conecuh County              13228
          Coosa County                11539
          Covington County            37765
          Crenshaw County             13906
          Cullman County              80406
          Dale County                 50251
          Dallas County               43820
          DeKalb County               71109
          Elmore County               79303
          Escambia County             38319
          Etowah County              104430
          Fayette County              17241
...                                     ...
Wisconsin Washington County          131887
          Waukesha County            389891
          Waupaca County              52410
          Waushara County             24496
          Winnebago County           166994
          Wood County                 74749
Wyoming   Wyoming                    563626
          Albany County               36299
          Big Horn County             11668
          Campbell County             46133
          Carbon County               15885
          Converse County             13833
          Crook County                 7083
          Fremont County              40123
          Goshen County               13249
          Hot Springs County           4812
          Johnson County               8569
          Laramie County              91738
          Lincoln County              18106
          Natrona County              75450
          Niobrara County              2484
          Park County                 28205
          Platte County                8667
          Sheridan County             29116
          Sublette County             10247
          Sweetwater County           43806
          Teton County                21294
          Uinta County                21118
          Washakie County              8533
          Weston County                7208
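For anyone who wants to reproduce the structure, here is a minimal sketch. It assumes only a handful of rows copied from the table above (the real frame is much larger), and the intermediate name `flat` is made up for illustration. It shows that the frame has two index levels plus one value column.

```python
import pandas as pd

# Hypothetical sample: six rows copied from the table above.
flat = pd.DataFrame({
    "STNAME": ["Alabama", "Alabama", "Alabama",
               "Wyoming", "Wyoming", "Wyoming"],
    "CTYNAME": ["Alabama", "Autauga County", "Baldwin County",
                "Wyoming", "Albany County", "Big Horn County"],
    "CENSUS2010POP": [4779736, 54571, 182265, 563626, 36299, 11668],
})

# Move the state and county columns into a two-level (nested) index.
census_df = flat.set_index(["STNAME", "CTYNAME"])

print(census_df.index.names)    # names of the two index levels
print(census_df.columns.tolist())  # the single value column
```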

Now I want to sort the second index level (the counties) within each state by the values of the population column. I tried using

census_df = census_df.sort('CENSUS2010POP') 

But then all rows are sorted globally, regardless of state:

                         CENSUS2010POP
STNAME   CTYNAME                      
Texas    Loving County              82
Hawaii   Kalawao County             90
Texas    King County               286
         Kenedy County             416
Nebraska Arthur County             460

How can I sort the counties by population within each state? Any help is much appreciated.

1 answer:

Answer 0: (score: 1)

I think you need groupby on the first level, STNAME, and then apply sort_values to CENSUS2010POP. That way STNAME stays fixed and only the second level is ordered by the values of this column:

print (census_df.groupby(level=0)['CENSUS2010POP']
                .apply(lambda x: x.sort_values())
                .reset_index(level=0,drop=True))

STNAME     CTYNAME
Alabama    Bullock County       10914
           Choctaw County       13859
           Butler County        20947
           Bibb County          22915
           Clarke County        25833
           Cherokee County      25989
           Barbour County       27457
           Chambers County      34215
           Chilton County       43643
           Autauga County       54571
           Blount County        57322
           Calhoun County      118572
           Baldwin County      182265
           Alabama            4779736
Wisconsin  Waushara County      24496
           Waupaca County       52410
           Wood County          74749
           Washington County   131887
           Winnebago County    166994
           Waukesha County     389891
Wyoming    Crook County          7083
           Big Horn County      11668
           Goshen County        13249
           Converse County      13833
           Carbon County        15885
           Albany County        36299
           Fremont County       40123
           Campbell County      46133
           Wyoming             563626
Name: CENSUS2010POP, dtype: int64
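As a possible alternative, not part of the original answer, the same per-state ordering can be sketched with a single sort_values call that names both the STNAME index level and the population column. This assumes a pandas version in which sort_values accepts index level names in `by` (0.23 or later, as far as I know). The sample rows are a hypothetical subset copied from the question. Note also that the DataFrame.sort method used in the question has been removed from later pandas releases; sort_values is its replacement.

```python
import pandas as pd

# Hypothetical sample: a few rows copied from the question's table.
rows = [
    ("Alabama", "Autauga County", 54571),
    ("Alabama", "Baldwin County", 182265),
    ("Alabama", "Bullock County", 10914),
    ("Wyoming", "Albany County", 36299),
    ("Wyoming", "Crook County", 7083),
    ("Wyoming", "Campbell County", 46133),
]
census_df = (
    pd.DataFrame(rows, columns=["STNAME", "CTYNAME", "CENSUS2010POP"])
    .set_index(["STNAME", "CTYNAME"])
)

# Sort by the STNAME index level first, then by population within each state.
# Passing an index level name in `by` is the assumption flagged above.
sorted_df = census_df.sort_values(["STNAME", "CENSUS2010POP"])
print(sorted_df)
```

Unlike the groupby version, this returns a DataFrame rather than a Series and needs no reset_index step, but it also orders the states themselves alphabetically.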