Sorting a hierarchical index by column while keeping the original columns

Time: 2018-10-02 17:07:50

Tags: python-3.x pandas

I am trying to simulate a hierarchical-index DataFrame as follows:

>>> import pandas as pd
>>> raw_data = ({'city': ['Delhi', 'Kanpur', 'Mumbai', 'Pune','Delhi', 'Kanpur', 'Mumbai', 'Pune'],
...                 'rank': ['1st', '2nd', '1st', '2nd','1st', '2nd', '1st', '2nd'],
...                 'name': ['Ramesh', 'Kirpal', 'Jungi', 'Sanju','Ramesh', 'Kirpal', 'Jungi', 'Sanju'],
...                 'score1': [10,15,20,25,10,15,20,25],
...                 'score2': [20,35,40,45,20,35,40,45]})

Below is how the DataFrame looks; it comes with the default index.

>>> df = pd.DataFrame(raw_data, columns = ['city', 'rank', 'name', 'score1', 'score2'])
>>> df
     city rank    name  score1  score2
0   Delhi  1st  Ramesh      10      20
1  Kanpur  2nd  Kirpal      15      35
2  Mumbai  1st   Jungi      20      40
3    Pune  2nd   Sanju      25      45
4   Delhi  1st  Ramesh      10      20
5  Kanpur  2nd  Kirpal      15      35
6  Mumbai  1st   Jungi      20      40
7    Pune  2nd   Sanju      25      45

I want a hierarchical index built from the 'city' and 'rank' columns using the set_index method, while keeping the original columns intact.

>>> df.set_index(['city', 'rank'], drop=False)
               city rank    name  score1  score2
city   rank
Delhi  1st    Delhi  1st  Ramesh      10      20
Kanpur 2nd   Kanpur  2nd  Kirpal      15      35
Mumbai 1st   Mumbai  1st   Jungi      20      40
Pune   2nd     Pune  2nd   Sanju      25      45
Delhi  1st    Delhi  1st  Ramesh      10      20
Kanpur 2nd   Kanpur  2nd  Kirpal      15      35
Mumbai 1st   Mumbai  1st   Jungi      20      40
Pune   2nd     Pune  2nd   Sanju      25      45

But I want the index to be sorted first by city and then by rank.

1 Answer:

Answer 0 (score: 2)

You are almost there; you just need to apply sort_index():

df.set_index(['city','rank'], drop=False).sort_index()

This yields:

               city rank    name  score1  score2
city   rank                                     
Delhi  1st    Delhi  1st  Ramesh      10      20
       1st    Delhi  1st  Ramesh      10      20
Kanpur 2nd   Kanpur  2nd  Kirpal      15      35
       2nd   Kanpur  2nd  Kirpal      15      35
Mumbai 1st   Mumbai  1st   Jungi      20      40
       1st   Mumbai  1st   Jungi      20      40
Pune   2nd     Pune  2nd   Sanju      25      45
       2nd     Pune  2nd   Sanju      25      45
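
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The variable names `indexed` and `sorted_df` are only illustrative and are not from the question; the data is the question's sample, written with list repetition for brevity.

```python
import pandas as pd

# Same sample data as in the question (each list repeated twice).
raw_data = {
    'city': ['Delhi', 'Kanpur', 'Mumbai', 'Pune'] * 2,
    'rank': ['1st', '2nd', '1st', '2nd'] * 2,
    'name': ['Ramesh', 'Kirpal', 'Jungi', 'Sanju'] * 2,
    'score1': [10, 15, 20, 25] * 2,
    'score2': [20, 35, 40, 45] * 2,
}
df = pd.DataFrame(raw_data, columns=['city', 'rank', 'name', 'score1', 'score2'])

# drop=False keeps 'city' and 'rank' as ordinary columns
# in addition to using them as index levels.
indexed = df.set_index(['city', 'rank'], drop=False)

# sort_index() on a MultiIndex orders rows by the first level,
# then by the second level within each first-level value.
sorted_df = indexed.sort_index()

print(sorted_df)
print(list(sorted_df.columns))
```

Because sort_index() returns a new DataFrame by default, `indexed` keeps its original row order.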

To remove the duplicate rows, add drop_duplicates():

df.set_index(['city','rank'], drop=False).sort_index().drop_duplicates()

This yields:

               city rank    name  score1  score2
city   rank                                     
Delhi  1st    Delhi  1st  Ramesh      10      20
Kanpur 2nd   Kanpur  2nd  Kirpal      15      35
Mumbai 1st   Mumbai  1st   Jungi      20      40
Pune   2nd     Pune  2nd   Sanju      25      45
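
One detail that may be worth knowing: drop_duplicates() compares column values only and does not look at the index. It works here because drop=False left 'city' and 'rank' in the columns. The sketch below also shows a possible alternative that deduplicates on the index itself, which is an assumption about what you might want if the columns had been dropped. The names `deduped` and `by_index` are illustrative only.

```python
import pandas as pd

raw_data = {
    'city': ['Delhi', 'Kanpur', 'Mumbai', 'Pune'] * 2,
    'rank': ['1st', '2nd', '1st', '2nd'] * 2,
    'name': ['Ramesh', 'Kirpal', 'Jungi', 'Sanju'] * 2,
    'score1': [10, 15, 20, 25] * 2,
    'score2': [20, 35, 40, 45] * 2,
}
df = pd.DataFrame(raw_data)
sorted_df = df.set_index(['city', 'rank'], drop=False).sort_index()

# Deduplicate on the column values (the index is ignored in the comparison).
deduped = sorted_df.drop_duplicates()

# Alternative: keep the first row for each distinct (city, rank) index entry.
# This compares the index only, so it can differ from drop_duplicates()
# when rows share an index entry but differ in other columns.
by_index = sorted_df[~sorted_df.index.duplicated(keep='first')]

print(deduped)
print(len(deduped), len(by_index))
```

On this sample data both variants keep the same four rows, since every repeated index entry is also a fully repeated row.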