How do I combine two 2D DataFrames into a single pandas DataFrame with a MultiIndex?

Asked: 2018-12-14 14:32:39

Tags: python pandas dataframe

I have two DataFrames of the same size, as follows:

import pandas as pd

cost_type1 = pd.DataFrame([[1,2,3,4], [100,200,300,400]]).transpose()
cost_type2 = pd.DataFrame([[1,4,9,25], [10,40,90,250]]).transpose()

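Since both DataFrames hold cost data, I would like to combine them into a single structure so that I can write something like cost[i] and get the cost matrix for type i.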

I tried using a MultiIndex, as follows:

timestamps =["2014-01-01", "2014-02-01"]
categories = ["A", "B","C","D"]
idx = pd.MultiIndex.from_product([timestamps,categories], names=["ts", "cat"])
df = pd.DataFrame(index=idx, columns=["col1", "col2"])

This gives me a nice empty DataFrame, as shown below (output):

               col1 col2
ts         cat          
2014-01-01 A    NaN  NaN
           B    NaN  NaN
           C    NaN  NaN
           D    NaN  NaN
2014-02-01 A    NaN  NaN
           B    NaN  NaN
           C    NaN  NaN
           D    NaN  NaN

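However, I can't fill this "larger" DataFrame with the two "smaller" ones I already have. I tried something like the following, without success: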

df.loc["2014-01-01",:] = newdf1
df.loc["2014-02-01",:] = newdf2

Does anyone know how to solve this? Thanks!

1 Answer:

Answer 0: (score: 1)

Use concat with the keys argument to create the new outer index level for each DataFrame, so no empty DataFrame is needed:

timestamps = ["2014-01-01", "2014-02-01"]
categories = ["A", "B","C","D"]
idx = pd.MultiIndex.from_product([timestamps,categories], names=["ts", "cat"])

df = pd.concat([cost_type1.set_index([categories]), 
                cost_type2.set_index([categories])], keys=timestamps)
df.columns=["col1", "col2"]
df.index.names=['ts','cat']
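
With the frame built this way, the "cost[i]" lookup from the question can be done by selecting on the outer index level. The following is a minimal sketch that repeats the setup so it runs on its own; the names cost_first and cost_second are only illustrative and do not appear in the question.

```python
import pandas as pd

cost_type1 = pd.DataFrame([[1, 2, 3, 4], [100, 200, 300, 400]]).transpose()
cost_type2 = pd.DataFrame([[1, 4, 9, 25], [10, 40, 90, 250]]).transpose()

timestamps = ["2014-01-01", "2014-02-01"]
categories = ["A", "B", "C", "D"]

# Same construction as in the answer: one outer key per input frame.
df = pd.concat(
    [cost_type1.set_index([categories]), cost_type2.set_index([categories])],
    keys=timestamps,
)
df.columns = ["col1", "col2"]
df.index.names = ["ts", "cat"]

# Selecting on the outer level drops it and returns the 4x2 block for that key.
cost_first = df.loc["2014-01-01"]

# xs selects the same kind of block but names the level explicitly.
cost_second = df.xs("2014-02-01", level="ts")

print(cost_first)
print(cost_second)
```

Both selections return an ordinary DataFrame indexed by cat, so the rest of the code can treat each block like the original per-type frame.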

If the input is a list of DataFrames, use a list comprehension:

dfs = [cost_type1, cost_type2]
df = pd.concat([x.set_index([categories]) for x in dfs], keys=timestamps)
df.columns=["col1", "col2"]
df.index.names=['ts','cat']
print(df)
                col1  col2
ts         cat            
2014-01-01 A       1   100
           B       2   200
           C       3   300
           D       4   400
2014-02-01 A       1    10
           B       4    40
           C       9    90
           D      25   250
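
If you would rather keep the pre-allocated empty frame from the question, one likely reason the .loc assignment does not fill it is label alignment: when the right-hand side is a DataFrame, pandas aligns on its index and column labels (here 0..3 and 0, 1), which do not match cat and col1/col2. The sketch below assumes the question's newdf1 and newdf2 stand for cost_type1 and cost_type2, and assigns the raw arrays so values are placed by position.

```python
import pandas as pd

cost_type1 = pd.DataFrame([[1, 2, 3, 4], [100, 200, 300, 400]]).transpose()
cost_type2 = pd.DataFrame([[1, 4, 9, 25], [10, 40, 90, 250]]).transpose()

timestamps = ["2014-01-01", "2014-02-01"]
categories = ["A", "B", "C", "D"]
idx = pd.MultiIndex.from_product([timestamps, categories], names=["ts", "cat"])
df = pd.DataFrame(index=idx, columns=["col1", "col2"])

# .values strips the labels, so the 4x2 array is written by position
# into the four rows under each timestamp.
df.loc["2014-01-01", :] = cost_type1.values
df.loc["2014-02-01", :] = cost_type2.values

# The empty frame starts out with object dtype; convert once it is filled.
df = df.astype(int)
print(df)
```

This relies on the rows of each small frame already being in A, B, C, D order, since nothing is matched by label; the concat approach above avoids that assumption.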