How do I renumber a MultiIndex level after sorting by one of the other levels? This is the sorted DataFrame:
+--------+---+------+
| | | text |
+--------+---+------+
| letter | | |
+--------+---+------+
| a | 0 | blah |
+--------+---+------+
| | 3 | blah |
+--------+---+------+
| | 6 | blah |
+--------+---+------+
| b | 1 | blah |
+--------+---+------+
| | 4 | blah |
+--------+---+------+
| | 7 | blah |
+--------+---+------+
| c | 2 | blah |
+--------+---+------+
| | 5 | blah |
+--------+---+------+
| | 8 | blah |
+--------+---+------+
This is what I want (though ideally with the original index kept in its own column):
+--------+---+------+
| | | text |
+--------+---+------+
| letter | | |
+--------+---+------+
| a | 0 | blah |
+--------+---+------+
| | 1 | blah |
+--------+---+------+
| | 2 | blah |
+--------+---+------+
| b | 0 | blah |
+--------+---+------+
| | 1 | blah |
+--------+---+------+
| | 2 | blah |
+--------+---+------+
| c | 0 | blah |
+--------+---+------+
| | 1 | blah |
+--------+---+------+
| | 2 | blah |
+--------+---+------+
I have searched for an answer and tried writing several different things, but I am stumped.
Code to reproduce the first table above:
import pandas as pd
df = pd.DataFrame({'letter': ['a', 'b', 'c'] * 3, 'text': ['blah'] * 9})
df.set_index(keys='letter', append=True, inplace=True)
df = df.reorder_levels(order=[1, 0])
df.sort_index(level=0, inplace=True)
print(df)
Answer 0 (score: 2)
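You can use cumcount. Note that set_index(['letter', 'yourindex']) needs letter to be a regular column, so run this on the frame before letter is moved into the index: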
df=df.assign(yourindex=df.groupby('letter').cumcount()).set_index(['letter','yourindex']).sort_index(level=[0,1])
df
Out[861]:
text
letter yourindex
a 0 blah
1 blah
2 blah
b 0 blah
1 blah
2 blah
c 0 blah
1 blah
2 blah
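The one-liner above relies on letter being a column, because set_index only looks up column names. A minimal sketch of the same cumcount idea that starts from the sorted MultiIndex frame in the question and keeps the old numbering follows. The column name orig is a hypothetical choice made for this example, not something from the question.

```python
import pandas as pd

# Rebuild the sorted frame from the question.
df = pd.DataFrame({'letter': ['a', 'b', 'c'] * 3, 'text': ['blah'] * 9})
df = df.set_index('letter', append=True).reorder_levels([1, 0]).sort_index(level=0)

# Number the rows within each letter: 0, 1, 2 for every group.
counter = df.groupby(level='letter').cumcount()

renumbered = (
    df.rename_axis(index=['letter', 'orig'])  # name the unnamed inner level (hypothetical name)
      .reset_index('orig')                    # move the old numbering into a column
      .assign(yourindex=counter.to_numpy())   # attach the counter by position
      .set_index('yourindex', append=True)    # make it the new inner level
)
print(renumbered)
```

The counter is attached with to_numpy() because its index still carries the old labels, which no longer match the frame once the orig level has been moved out.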
Answer 1 (score: 1)
Here is what I did:
df["new_index"] = df.groupby("letter").cumcount()
df
This gives you:
text new_index
letter
a 0 blah 0
3 blah 1
6 blah 2
b 1 blah 0
4 blah 1
7 blah 2
c 2 blah 0
5 blah 1
8 blah 2
Then you can reset the index and set the new one (the old unnamed level comes back as the column level_1):
df.reset_index().set_index(["letter","new_index"])
level_1 text
letter new_index
a 0 0 blah
1 3 blah
2 6 blah
b 0 1 blah
1 4 blah
2 7 blah
c 0 2 blah
1 5 blah
2 8 blah
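Put together, the two steps above can be sketched as a single script. reset_index and set_index both return new frames, so the result has to be assigned. The names original_index, result, and result_no_orig are hypothetical and only used for this illustration.

```python
import pandas as pd

# Rebuild the sorted frame from the question.
df = pd.DataFrame({'letter': ['a', 'b', 'c'] * 3, 'text': ['blah'] * 9})
df = df.set_index('letter', append=True).reorder_levels([1, 0]).sort_index(level=0)

# Step 1: per-letter counter; groupby accepts the index level name here.
df['new_index'] = df.groupby('letter').cumcount()

# Step 2: move both index levels into columns, give the unnamed one a
# readable name, then build the new MultiIndex.
result = (
    df.reset_index()
      .rename(columns={'level_1': 'original_index'})
      .set_index(['letter', 'new_index'])
)

# Optional: drop the old numbering if it is not needed.
result_no_orig = result.drop(columns='original_index')
print(result)
```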