Concat dfs with a MultiIndex in append mode

Asked: 2018-12-07 14:49:54

Tags: pandas

OK, here is my problem: I want to use a MultiIndex so that I effectively get a 3-D df. I can use

df = pd.concat([df1, df2], keys=('df1','df2'))

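But how do I add a new df3 to df afterwards? Essentially, I want to add new dfs in a loop, in an append-like fashion. I have a few thousand dfs, and storing all of them before I concat them would not be efficient. Is there a way to do this?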

To be more specific, suppose I have the following dfs:

import pandas as pd

df1 = pd.DataFrame(columns=['a', 'b', 'c'])
df2 = pd.DataFrame(columns=['a', 'b', 'c'])
df1.loc['index_1','b'] = 1
df1.loc['index_2','a'] = 2

df2.loc['index_7','a'] = 5
df3 = pd.DataFrame(columns=['a', 'b', 'c'])
df3.loc['index_9','c'] = 1

df = pd.concat([df1, df2], keys=('df1','df2'))


               a    b    c
df1 index_1  NaN    1  NaN
    index_2    2  NaN  NaN
df2 index_7    5  NaN  NaN

Can I add df3 in a similar way?

1 Answer:

Answer 0: (score: 0)

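So, after some searching, the best approach I found is to tag each df with a column holding its name, concatenate everything into the final df, reset its index, and then set the final MultiIndex from the name column and the old index. It should look like this: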

import pandas as pd

# create the dfs
df1 = pd.DataFrame(columns=['a', 'b', 'c'])
df2 = pd.DataFrame(columns=['a', 'b', 'c'])
df3 = pd.DataFrame(columns=['a', 'b', 'c'])

df1.loc['index_1','b'] = 1
df1.loc['index_2','a'] = 2
df2.loc['index_7','a'] = 5
df3.loc['index_9','c'] = 1

# add index in the form of a column
df1['df'] = 'df1' 
df2['df'] = 'df2'
df3['df'] = 'df3'

# reset index and set multiindex
df = pd.concat([df1, df2, df3], sort=True)
df.reset_index(inplace=True)
df.set_index(['df', 'index'], inplace=True)
df



               a    b    c
df  index
df1 index_1  NaN    1  NaN
    index_2    2  NaN  NaN
df2 index_7    5  NaN  NaN
df3 index_9  NaN  NaN    1
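
For the "loop" part of the question, here is a minimal sketch of two alternatives that skip the helper column. It is not taken from the answer above. `make_frames` and `append_frame` are hypothetical names, and the generator only stands in for however the real dfs are produced. The first option passes a dict to `pd.concat`, so the dict keys become the outer index level. The second wraps a single new df under its key and concatenates it onto an existing MultiIndex df.

```python
import numpy as np
import pandas as pd


def make_frames():
    """Hypothetical stand-in for a loop that produces (name, df) pairs."""
    yield 'df1', pd.DataFrame(
        {'a': [np.nan, 2.0], 'b': [1.0, np.nan], 'c': [np.nan, np.nan]},
        index=['index_1', 'index_2'])
    yield 'df2', pd.DataFrame(
        {'a': [5.0], 'b': [np.nan], 'c': [np.nan]}, index=['index_7'])
    yield 'df3', pd.DataFrame(
        {'a': [np.nan], 'b': [np.nan], 'c': [1.0]}, index=['index_9'])


# Option 1: collect the frames in a dict and concat once.
# The dict keys become the outer level of the MultiIndex.
frames = dict(make_frames())
df_all = pd.concat(frames, names=['df', 'index'])


# Option 2: add one more frame to a df that already has the MultiIndex.
def append_frame(df, new_df, key):
    """Return a new df with new_df added under the outer-level label key."""
    wrapped = pd.concat({key: new_df}, names=list(df.index.names))
    return pd.concat([df, wrapped])


df_12 = pd.concat({k: frames[k] for k in ('df1', 'df2')},
                  names=['df', 'index'])
df_123 = append_frame(df_12, frames['df3'], 'df3')
```

Note that every `pd.concat` call copies the data, so calling `append_frame` inside a loop over thousands of dfs redoes that copy each time. If the frames fit in memory, collecting them first and concatenating once (option 1) is usually the cheaper route.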