How can I convert an existing DataFrame with single-level columns into one with hierarchical column labels (a MultiIndex)?
Example DataFrame:
In [1]:
import numpy as np  # needed for np.arange below
import pandas as pd
from pandas import DataFrame
df = DataFrame(np.arange(6).reshape((2,3)),
               index=['A','B'],
               columns=['one','two','three'])
df
Out [1]:
   one  two  three
A    0    1      2
B    3    4      5
I thought reindex() would work, but I get NaN:
In [2]:
df.reindex(columns=[['odd','even','odd'],df.columns])
Out [2]:
   odd  even    odd
   one   two  three
A  NaN   NaN    NaN
B  NaN   NaN    NaN
The same happens if I use DataFrame():
In [3]:
DataFrame(df,columns=[['odd','even','odd'],df.columns])
Out [3]:
   odd  even    odd
   one   two  three
A  NaN   NaN    NaN
B  NaN   NaN    NaN
That last approach does work if I pass df.values:
In [4]:
DataFrame(df.values,index=df.index,columns=[['odd','even','odd'],df.columns])
Out [4]:
   odd  even    odd
   one   two  three
A    0     1      2
B    3     4      5
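What is the right way to do this? And why does reindex() give NaN?
Answer 0 (score: 16)
You were close. Just assign a new index of the same length directly to the columns; if it is a list of lists, it is converted to a MultiIndex.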
In [8]: df
Out[8]:
   one  two  three
A    0    1      2
B    3    4      5
In [10]: df.columns = [['odd','even','odd'],df.columns]
In [11]: df
Out[11]:
   odd  even    odd
   one   two  three
A    0     1      2
B    3     4      5
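The same assignment can also be written with an explicit MultiIndex, which makes the two levels visible and leaves the original frame untouched. This is only a sketch: the level names 'parity' and 'name' and the variable names are assumptions chosen for illustration, not something from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(6).reshape((2, 3)),
                  index=['A', 'B'],
                  columns=['one', 'two', 'three'])

# Build the two-level column index explicitly.
# The level names 'parity' and 'name' are illustrative assumptions.
new_columns = pd.MultiIndex.from_arrays(
    [['odd', 'even', 'odd'], df.columns],
    names=['parity', 'name'],
)

# Work on a copy so the original single-level frame is left as it was.
result = df.copy()
result.columns = new_columns

# Columns are now addressed by (level 0, level 1) tuples.
first_odd = result[('odd', 'one')]
```

Assigning to .columns only relabels the existing data positionally, so no values are looked up or lost.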
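reindex reorders or filters by the labels that already exist. You get all NaNs because you are asking it to find existing columns that match the new index; none of them match, so NaN is all you get.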
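To make the label-matching behaviour concrete, here is a small sketch on the same frame. The label 'four' and the variable names are hypothetical and exist only for this illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(6).reshape((2, 3)),
                  index=['A', 'B'],
                  columns=['one', 'two', 'three'])

# Existing labels are selected and reordered; 'four' (a made-up label)
# does not exist in df, so that column is filled with NaN.
picked = df.reindex(columns=['three', 'one', 'four'])

# With a two-level target, every label is a tuple such as ('odd', 'one').
# None of these tuples is an existing column label, so every column is NaN.
target = pd.MultiIndex.from_arrays([['odd', 'even', 'odd'], df.columns])
missing = df.reindex(columns=target)
```

In other words, reindex looks values up by label, whereas assigning to df.columns only relabels the data that is already there.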