Question

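I have a dataframe with two index levels, one for country (20 countries) and one for year (a 10-year period). For each (country, year) tuple I have two columns of data. I wrote a for loop that accumulates the values in the first column over the years for each country, so that each year holds the cumulative total of the previous years rather than that year's value alone.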

for country in data.index.get_level_values(level=0).unique():
    for year in range(2005,2015):
        # Add the previous year's running total to the next year's value
        data.loc[(country, year+1), 'col1'] += data.loc[(country, year), 'col1']
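A possible alternative to the nested loop, sketched under assumptions: the index levels are named `Country` and `Year` as in the example tables, and the sample values below are made up to reproduce those tables. Grouping by the country level and calling `cumsum` should give the same running totals without iterating by hand.

```python
import pandas as pd

# Hypothetical sample data shaped like the frame described in the question.
index = pd.MultiIndex.from_product(
    [["Brazil", "Germany"], [2006, 2007, 2014]], names=["Country", "Year"]
)
data = pd.DataFrame(
    {"col1": [3, 9, 138, 33, 31, 686], "col2": [0, 0, 0, 0, 0, 0]}, index=index
)

# Make sure the years are in ascending order within each country,
# then compute a running total of col1 per country.
data = data.sort_index()
data["col1"] = data.groupby(level="Country")["col1"].cumsum()

print(data)
```

Unlike the loop, this does not depend on a hard-coded range of years, so it also copes with gaps such as the jump from 2007 to 2014 in the example.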

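It works well (even though there must be a more elegant way to do this). Then I tried to sort the countries in descending order of col1, so I added this line: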

data.loc[country]=data.loc[country].sort_values(by=['col1'])

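inside my "country" loop. But what I get is a dataframe full of NaN values. I then tried various combinations of sort_values and sort_index, but nothing worked. I also tried grouping by the value of interest in col1 (the last one, which is the largest) and then sorting the index of the grouped result: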

data.groupby(data['col1'].loc[country,2014]).sort_index(level=0,ascending=False)

But it gives me a nice KeyError: 5270

Here is an example of what I have:

                      col1 | col2
    Country |  Year |      |
    Brazil  |  2006 |   3  |
            |  2007 |   12 |
            |  2014 |   150|
    Germany |  2006 |   33 |
            |  2007 |   64 |
            |  2014 |   750|

I would like this:

                      col1 | col2
    Country |  Year |      |
    Germany |  2006 |   33 |
            |  2007 |   64 |
            |  2014 |   750|
    Brazil  |  2006 |   3  |
            |  2007 |   12 |
            |  2014 |   150|

based on the fact that Germany's col1 (2014) is higher than Brazil's.

Edit: I managed to get my index sorted by col1 (2014):

data['col1'].loc[:,2014].sort_values(ascending=False)

which gives me a Series. I am now looking for the best way to apply this index ordering to the whole dataframe.
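One way this ordering might be applied to the whole frame, sketched under assumptions: the level names `Country` and `Year`, the sample values, and the temporary `_key` column are all illustrative and not taken from the real data. The idea is to broadcast each country's 2014 value onto all of its rows and sort on that, keeping the years in ascending order within each country.

```python
import pandas as pd

# Hypothetical sample data matching the example tables (already cumulative).
index = pd.MultiIndex.from_product(
    [["Brazil", "Germany"], [2006, 2007, 2014]], names=["Country", "Year"]
)
data = pd.DataFrame(
    {"col1": [3, 12, 150, 33, 64, 750], "col2": [0, 0, 0, 0, 0, 0]}, index=index
)

# col1 in 2014 for each country, indexed by country name.
final = data["col1"].xs(2014, level="Year")

# Attach each country's 2014 value to every one of its rows as a temporary
# sort key, sort by it (descending), then by country and year (ascending).
countries = data.index.get_level_values("Country")
ranked = (
    data.assign(_key=countries.map(final).to_numpy())
    .sort_values(["_key", "Country", "Year"], ascending=[False, True, True])
    .drop(columns="_key")
)

print(ranked)
```

Including `Country` in the sort keys keeps the rows of two countries from interleaving if they happen to share the same 2014 value. Mixing column labels and index level names in `sort_values` requires a reasonably recent pandas version.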

Sorting column values in a multi-index dataframe using a loop

0 answers: