OK, here is my problem: I want to use a MultiIndex so that I get a 3-d df. I can use
df = pd.concat([df1, df2], keys=('df1','df2'))
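But how can I add a new df3 to df? Essentially, I want to add new dfs in a loop, in append mode. I have a few thousand dfs, and storing all of them before concatenating them would not be efficient. Is there a way to do this?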
More specifically, suppose I have the following dfs
import pandas as pd
df1 = pd.DataFrame(columns=['a', 'b', 'c'])
df2 = pd.DataFrame(columns=['a', 'b', 'c'])
df1.loc['index_1','b'] = 1
df1.loc['index_2','a'] = 2
df2.loc['index_7','a'] = 5
df3 = pd.DataFrame(columns=['a', 'b', 'c'])
df3.loc['index_9','c'] = 1
df = pd.concat([df1, df2], keys=('df1','df2'))
               a    b    c
df1 index_1  NaN    1  NaN
    index_2    2  NaN  NaN
df2 index_7    5  NaN  NaN
Can I add df3 in a similar way?
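For illustration, one way to attach df3 to the already concatenated frame is to give df3 its own outer key first and then concatenate the two. This is only a sketch using the example frames from the question; the name `df3_keyed` is made up here, and each such `pd.concat` call copies the accumulated data, so it may not scale well to thousands of frames.

```python
import pandas as pd

# Example frames, built the same way as in the question.
df1 = pd.DataFrame(columns=['a', 'b', 'c'])
df2 = pd.DataFrame(columns=['a', 'b', 'c'])
df3 = pd.DataFrame(columns=['a', 'b', 'c'])
df1.loc['index_1', 'b'] = 1
df1.loc['index_2', 'a'] = 2
df2.loc['index_7', 'a'] = 5
df3.loc['index_9', 'c'] = 1

df = pd.concat([df1, df2], keys=('df1', 'df2'))

# Wrap df3 in a one-element concat so it gets the outer level 'df3',
# then stack it under the existing two-level frame.
df3_keyed = pd.concat([df3], keys=['df3'])  # hypothetical helper name
df = pd.concat([df, df3_keyed])
```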
Answer 0 (score: 0)
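So, after some searching, I found that the best way is to first create the final df, reset its index, and then set the final MultiIndex. It should look something like this: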
import pandas as pd

# create the DataFrames
df1 = pd.DataFrame(columns=['a', 'b', 'c'])
df2 = pd.DataFrame(columns=['a', 'b', 'c'])
df3 = pd.DataFrame(columns=['a', 'b', 'c'])
df1.loc['index_1','b'] = 1
df1.loc['index_2','a'] = 2
df2.loc['index_7','a'] = 5
df3.loc['index_9','c'] = 1
# add index in the form of a column
df1['df'] = 'df1'
df2['df'] = 'df2'
df3['df'] = 'df3'
# reset index and set multiindex
df = pd.concat([df1, df2, df3], sort=True)
df.reset_index(inplace=True)
df.set_index(['df', 'index'], inplace=True)
df
               a    b    c
df  index
df1 index_1  NaN    1  NaN
    index_2    2  NaN  NaN
df2 index_7    5  NaN  NaN
df3 index_9  NaN  NaN    1
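As a possible alternative for the loop case, the frames can be collected in a dict and concatenated once at the end, since `pd.concat` accepts a mapping and uses its keys as the outer index level. The sketch below is only an assumption about how the frames arrive: `make_frames` is a hypothetical generator standing in for whatever produces the real dfs, and the frames are built directly rather than with `.loc`.

```python
import numpy as np
import pandas as pd


def make_frames():
    """Hypothetical stand-in for the code that produces each named frame."""
    yield 'df1', pd.DataFrame(
        {'a': [np.nan, 2], 'b': [1, np.nan], 'c': [np.nan, np.nan]},
        index=['index_1', 'index_2'],
    )
    yield 'df2', pd.DataFrame(
        {'a': [5], 'b': [np.nan], 'c': [np.nan]}, index=['index_7']
    )
    yield 'df3', pd.DataFrame(
        {'a': [np.nan], 'b': [np.nan], 'c': [1]}, index=['index_9']
    )


# Collect the frames by name inside the loop.
frames = {}
for name, frame in make_frames():
    frames[name] = frame

# One concat at the end: dict keys become the outer level,
# and `names` labels the two index levels.
df = pd.concat(frames, names=['df', 'index'], sort=True)
```

Concatenating once avoids copying the growing result on every iteration. If holding every frame in memory is the concern, the same pattern could be applied in batches, concatenating each batch and then concatenating the batch results.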