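I can't find an efficient way to add a single row to a MultiIndexed DataFrame. When I add a row, the MultiIndex is flattened into a plain index of tuples. Strangely, this is not a problem for MultiIndexed columns.
System info: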
Python 3.6.1 |Continuum Analytics, Inc.| (default, Mar 22 2017, 19:25:17)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> pd.__version__
'0.19.2'
Sample data: a DataFrame with MultiIndex rows and columns
import numpy as np
import pandas as pd
# Note: labels= is the keyword in pandas 0.19.2; pandas 0.24 and later renamed it to codes=.
index = pd.MultiIndex(levels=[['bar', 'foo'], ['one', 'two']],
                      labels=[[0, 0, 1, 1], [0, 1, 0, 1]],
                      names=['row_0', 'row_1'])
columns = pd.MultiIndex(levels=[['dull', 'shiny'], ['a', 'b']],
                        labels=[[0, 0, 1, 1], [0, 1, 0, 1]],
                        names=['col_0', 'col_1'])
df = pd.DataFrame(np.ones((4, 4)), columns=columns, index=index)
print(df)
col_0        dull       shiny
col_1           a    b      a    b
row_0 row_1
bar   one     1.0  1.0    1.0  1.0
      two     1.0  1.0    1.0  1.0
foo   one     1.0  1.0    1.0  1.0
      two     1.0  1.0    1.0  1.0
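For anyone reproducing this on a newer pandas, where the labels= keyword is no longer accepted, here is a rough sketch that should build the same sample frame. Using MultiIndex.from_product instead of explicit levels/labels is my substitution, not what the question used; the names mirror the question.

```python
import numpy as np
import pandas as pd

# The Cartesian product of the level values yields the same four
# (row_0, row_1) pairs and (col_0, col_1) pairs as in the question.
index = pd.MultiIndex.from_product([['bar', 'foo'], ['one', 'two']],
                                   names=['row_0', 'row_1'])
columns = pd.MultiIndex.from_product([['dull', 'shiny'], ['a', 'b']],
                                     names=['col_0', 'col_1'])
df = pd.DataFrame(np.ones((4, 4)), index=index, columns=columns)
print(df)
```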
Adding another column to the DataFrame is no problem:
df['last_col'] = 42 #define a new column and assign a value
print(df)
col_0        dull       shiny       last_col
col_1           a    b      a    b
row_0 row_1
bar   one     1.0  1.0    1.0  1.0        42
      two     1.0  1.0    1.0  1.0        42
foo   one     1.0  1.0    1.0  1.0        42
      two     1.0  1.0    1.0  1.0        42
However, if I do the same to add a row (using loc), the MultiIndex is flattened into a plain index of tuples:
df.loc['last_row'] = 43 #define a new row and assign a value
print(df)
col_0       dull        shiny        last_col
col_1          a     b      a     b
(bar, one)   1.0   1.0    1.0   1.0        42
(bar, two)   1.0   1.0    1.0   1.0        42
(foo, one)   1.0   1.0    1.0   1.0        42
(foo, two)   1.0   1.0    1.0   1.0        42
last_row    43.0  43.0   43.0  43.0        43
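Does anyone know a simple and efficient way to add a row without flattening the index? Many thanks!
Answer 0 (score: 2)
I think you need to use a tuple to define the MultiIndex key: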
df.loc[('last_row', 'a'), :] = 43
print(df)
col_0           dull        shiny
col_1              a     b      a     b
row_0    row_1
bar      one     1.0   1.0    1.0   1.0
         two     1.0   1.0    1.0   1.0
foo      one     1.0   1.0    1.0   1.0
         two     1.0   1.0    1.0   1.0
last_row a      43.0  43.0   43.0  43.0
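To check that the index really stays a MultiIndex, here is a small self-contained sketch of the tuple-key assignment. The frame is rebuilt with MultiIndex.from_product (my choice, so it also runs on newer pandas), and the isinstance check is just an illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_product([['bar', 'foo'], ['one', 'two']],
                                   names=['row_0', 'row_1'])
columns = pd.MultiIndex.from_product([['dull', 'shiny'], ['a', 'b']],
                                     names=['col_0', 'col_1'])
df = pd.DataFrame(np.ones((4, 4)), index=index, columns=columns)

# The row key carries one label per index level, so the new row
# fits into the existing two-level index.
df.loc[('last_row', 'a'), :] = 43

print(isinstance(df.index, pd.MultiIndex))
print(df.index.nlevels)
print(df)
```

The point is that the key has as many entries as the index has levels; a bare string leaves the second level unspecified.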
It works similarly for columns:
df[('last_col', 'a')] = 43
print(df)
col_0        dull       shiny       last_col
col_1           a    b      a    b         a
row_0 row_1
bar   one     1.0  1.0    1.0  1.0        43
      two     1.0  1.0    1.0  1.0        43
foo   one     1.0  1.0    1.0  1.0        43
      two     1.0  1.0    1.0  1.0        43
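The column case can be sketched the same way. Again the frame is rebuilt with MultiIndex.from_product (an assumption on my side so the snippet is self-contained), and the membership check at the end is only for illustration.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_product([['bar', 'foo'], ['one', 'two']],
                                   names=['row_0', 'row_1'])
columns = pd.MultiIndex.from_product([['dull', 'shiny'], ['a', 'b']],
                                     names=['col_0', 'col_1'])
df = pd.DataFrame(np.ones((4, 4)), index=index, columns=columns)

# A two-element tuple names both column levels explicitly.
df[('last_col', 'a')] = 43

print(('last_col', 'a') in df.columns)
print(df.columns.nlevels)
```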
EDIT:
It seems you need to specify the columns as well; to select all of them, pass a bare : as the column indexer:
df.loc['last_row',:] = 43
print(df)
col_0           dull        shiny
col_1              a     b      a     b
row_0    row_1
bar      one     1.0   1.0    1.0   1.0
         two     1.0   1.0    1.0   1.0
foo      one     1.0   1.0    1.0   1.0
         two     1.0   1.0    1.0   1.0
last_row        43.0  43.0   43.0  43.0
If a level is not specified, an empty string is added for it:
print(df.index)
MultiIndex(levels=[['bar', 'foo', 'last_row'], ['one', 'two', '']],
           labels=[[0, 0, 1, 1, 2], [0, 1, 0, 1, 2]],
           names=['row_0', 'row_1'])
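The empty-string padding can also be seen at the index level, without any DataFrame. This is a sketch using MultiIndex.insert; I am assuming, without having checked the internals, that this is the same mechanism the .loc enlargement goes through.

```python
import pandas as pd

index = pd.MultiIndex.from_product([['bar', 'foo'], ['one', 'two']],
                                   names=['row_0', 'row_1'])

# A scalar key only names the first level; the remaining level is
# filled with an empty string.
padded = index.insert(len(index), 'last_row')

# A full tuple names every level, so nothing needs to be padded.
explicit = index.insert(len(index), ('last_row', 'a'))

print(padded[-1])
print(explicit[-1])
```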
df.loc['last_row','dull'] = 43
print(df)
col_0           dull        shiny
col_1              a     b      a    b
row_0    row_1
bar      one     1.0   1.0    1.0  1.0
         two     1.0   1.0    1.0  1.0
foo      one     1.0   1.0    1.0  1.0
         two     1.0   1.0    1.0  1.0
last_row        43.0  43.0    NaN  NaN
df.loc['last_row', ('dull', 'a')] = 43
print(df)
col_0           dull       shiny
col_1              a    b      a    b
row_0    row_1
bar      one     1.0  1.0    1.0  1.0
         two     1.0  1.0    1.0  1.0
foo      one     1.0  1.0    1.0  1.0
         two     1.0  1.0    1.0  1.0
last_row        43.0  NaN    NaN  NaN
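If you would rather make the NaN row explicit instead of relying on enlargement through .loc, one possible alternative (not the method used in this answer) is to reindex first and then assign to a cell that already exists. The sketch below assumes the same sample frame, rebuilt with MultiIndex.from_product.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_product([['bar', 'foo'], ['one', 'two']],
                                   names=['row_0', 'row_1'])
columns = pd.MultiIndex.from_product([['dull', 'shiny'], ['a', 'b']],
                                     names=['col_0', 'col_1'])
df = pd.DataFrame(np.ones((4, 4)), index=index, columns=columns)

# Extend the row index by one full-depth key; reindexing creates the
# new row filled with NaN.
new_index = df.index.insert(len(df.index), ('last_row', ''))
df = df.reindex(new_index)

# Assign a single cell; the other cells of the new row stay NaN.
df.loc[('last_row', ''), ('dull', 'a')] = 43

print(df)
```

This costs an extra copy from reindex, so it is more about being explicit than about speed.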