I have two dataframes of the same size, as follows:
import pandas as pd
cost_type1 = pd.DataFrame([[1, 2, 3, 4], [100, 200, 300, 400]]).transpose()
cost_type2 = pd.DataFrame([[1, 4, 9, 25], [10, 40, 90, 250]]).transpose()
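Since both dataframes hold cost data, I would like to combine them into a single structure, so that I can write something like cost[i] and get the cost matrix for type i.
I tried using a MultiIndex as follows: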
timestamps =["2014-01-01", "2014-02-01"]
categories = ["A", "B","C","D"]
idx = pd.MultiIndex.from_product([timestamps, categories], names=["ts", "cat"])
df = pd.DataFrame(index=idx, columns=["col1", "col2"])
This gives me a nice empty dataframe, as shown below (output):
               col1 col2
ts         cat
2014-01-01 A    NaN  NaN
           B    NaN  NaN
           C    NaN  NaN
           D    NaN  NaN
2014-02-01 A    NaN  NaN
           B    NaN  NaN
           C    NaN  NaN
           D    NaN  NaN
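However, I cannot fill this "larger" dataframe with the two "smaller" ones I already have. I tried something like the following, without success: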
df.loc["2014-01-01",:] = newdf1
df.loc["2014-02-01",:] = newdf2
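Does anyone know how to solve this? Thanks!
Answer 0 (score: 1)
Use concat.
Give each DataFrame a new index (the categories) and pass the timestamps as keys, so the empty DataFrame is not needed: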
timestamps = ["2014-01-01", "2014-02-01"]
categories = ["A", "B","C","D"]
idx = pd.MultiIndex.from_product([timestamps,categories], names=["ts", "cat"])
df = pd.concat([cost_type1.set_index([categories]),
                cost_type2.set_index([categories])], keys=timestamps)
df.columns=["col1", "col2"]
df.index.names=['ts','cat']
If the input is a list of DataFrames, use a list comprehension:
dfs = [cost_type1, cost_type2]
df = pd.concat([x.set_index([categories]) for x in dfs], keys=timestamps)
df.columns=["col1", "col2"]
df.index.names=['ts','cat']
print(df)
                col1  col2
ts         cat
2014-01-01 A       1   100
           B       2   200
           C       3   300
           D       4   400
2014-02-01 A       1    10
           B       4    40
           C       9    90
           D      25   250
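To get back to the cost[i] style of lookup from the question, the combined frame can be sliced on the outer index level. The following is a minimal, self-contained sketch that repeats the construction above; the variable names cost, first, second and cell are only illustrative and are not part of the original answer.

```python
import pandas as pd

cost_type1 = pd.DataFrame([[1, 2, 3, 4], [100, 200, 300, 400]]).transpose()
cost_type2 = pd.DataFrame([[1, 4, 9, 25], [10, 40, 90, 250]]).transpose()

timestamps = ["2014-01-01", "2014-02-01"]
categories = ["A", "B", "C", "D"]

# Same construction as in the answer: index each frame by category,
# then stack the frames with the timestamps as the outer index level.
cost = pd.concat([x.set_index([categories]) for x in (cost_type1, cost_type2)],
                 keys=timestamps)
cost.columns = ["col1", "col2"]
cost.index.names = ["ts", "cat"]

# Selecting on the outer level returns the 4x2 matrix of one cost type.
first = cost.loc["2014-01-01"]

# xs does the same by level name and drops the selected level.
second = cost.xs("2014-02-01", level="ts")

# A single cell: (ts, cat) tuple as the row key, plus the column label.
cell = cost.loc[("2014-02-01", "D"), "col2"]

print(first)
print(second)
print(cell)
```

Both first and second are ordinary DataFrames indexed by cat, so they can be used wherever the original cost_type1 and cost_type2 were, apart from the renamed index and columns.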