Pandas Series .loc access error after append

Time: 2014-02-12 17:22:50

Tags: pandas append series multi-index

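I have a pandas Series with a MultiIndex, shown below. I want to add a new entry (new_series) to multi_df and call the result multi_df_appended. However, I don't understand why multi_df and multi_df_appended behave differently when I try to access a MultiIndex key that does not exist.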

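Below is the code that reproduces the problem. I want the penultimate line of code, multi_df_appended.loc['five', 'black', 'hard', 'square'], to return an empty Series, as it does for multi_df, but I get an error instead. What am I doing wrong here?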

import pandas as pd

df = pd.DataFrame({'id': range(1, 9),
                   'code': ['one', 'one', 'two', 'three',
                            'two', 'three', 'one', 'two'],
                   'colour': ['black', 'white', 'white', 'white',
                              'black', 'black', 'white', 'white'],
                   'texture': ['soft', 'soft', 'hard', 'soft', 'hard',
                               'hard', 'hard', 'hard'],
                   'shape': ['round', 'triangular', 'triangular', 'triangular', 'square',
                             'triangular', 'round', 'triangular']
                   }, columns=['id', 'code', 'colour', 'texture', 'shape'])
multi_df = df.set_index(['code', 'colour', 'texture', 'shape']).sort_index()['id']

# try to access a non-existing multi-index combination:
multi_df.loc['five', 'black', 'hard', 'square' ]
# Output: Series([], dtype: int64)  (an empty Series, as desired/expected)

# append multi_df with a new row 
new_series = pd.Series([9], index = [('four', 'black', 'hard', 'round')] )  
multi_df_appended = multi_df.append(new_series)

# now try again to access a non-existing multi-index combination:
multi_df_appended.loc['five', 'black', 'hard', 'square' ]
# Output: error: 'MultiIndex lexsort depth 0, key was length 4'  (now, instead of the empty Series, I get an error!?)

1 Answer:

Answer 0 (score: 2)

As @Jeff pointed out, if I call .sortlevel(0) after appending and then run .loc against an unknown index key, the "lexsort depth" error no longer occurs:

multi_df_appended_sorted = multi_df.append(new_series).sortlevel(0)
multi_df_appended_sorted.loc['five', 'black', 'hard', 'square' ]
# Output: Series([], dtype: int64)
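
For anyone on a recent pandas release, here is a rough sketch of the same idea. It assumes a version in which Series.append and .sortlevel are no longer available, so it swaps in pd.concat and .sort_index(), which is not what the original code used. The names appended, appended_sorted and missing_key are made up for illustration. Because I am not certain how .loc treats a missing key in every version, the sketch tests membership with `in` rather than relying on an empty Series coming back.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    'id': range(1, 9),
    'code': ['one', 'one', 'two', 'three', 'two', 'three', 'one', 'two'],
    'colour': ['black', 'white', 'white', 'white',
               'black', 'black', 'white', 'white'],
    'texture': ['soft', 'soft', 'hard', 'soft', 'hard', 'hard', 'hard', 'hard'],
    'shape': ['round', 'triangular', 'triangular', 'triangular', 'square',
              'triangular', 'round', 'triangular'],
})
levels = ['code', 'colour', 'texture', 'shape']
multi_df = df.set_index(levels).sort_index()['id']

# Build the new row with an explicit MultiIndex so the level names match.
new_index = pd.MultiIndex.from_tuples(
    [('four', 'black', 'hard', 'round')], names=levels)
new_series = pd.Series([9], index=new_index, name='id')

# Assumption: pd.concat stands in for the older Series.append.
appended = pd.concat([multi_df, new_series])

# The new row lands at the end, so the index is no longer sorted.
unsorted_ok = appended.index.is_monotonic_increasing

# Assumption: sort_index() stands in for the older .sortlevel(0).
appended_sorted = appended.sort_index()
sorted_ok = appended_sorted.index.is_monotonic_increasing

# Check for the key explicitly instead of relying on .loc's return value.
missing_key = ('five', 'black', 'hard', 'square')
has_missing = missing_key in appended_sorted.index

print(unsorted_ok, sorted_ok, has_missing)
```

The point is the same as above: appending puts the new row after the already sorted rows, so the index has to be re-sorted before doing lookups on it.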