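I have two pandas DataFrames (df1 and df2) with the same columns, and I am trying to copy each row of df1 into multiple rows of df2. df2 has a MultiIndex: the first level corresponds to df1's index values and the second level is an integer.
Here is how they are defined: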
import numpy as np
import pandas as pd

df1 = pd.DataFrame(index=['one', 'two', 'three'], columns=['c1', 'c2', 'c3', 'c4'], data=np.random.random((3, 4)))
index = pd.MultiIndex.from_arrays([['one', 'one', 'two', 'two', 'two', 'three'], [0, 1, 0, 1, 2, 0]])
df2 = pd.DataFrame(index=index, columns=['c1', 'c2', 'c3', 'c4'])
They look like this:
In : df1
Out:
             c1        c2        c3        c4
one    0.158366  0.843546  0.810493  0.925164
two    0.880147  0.464835  0.416196  0.389786
three  0.138132  0.061891  0.320366  0.727997
In : df2
Out:
          c1   c2   c3   c4
one   0  NaN  NaN  NaN  NaN
      1  NaN  NaN  NaN  NaN
two   0  NaN  NaN  NaN  NaN
      1  NaN  NaN  NaN  NaN
      2  NaN  NaN  NaN  NaN
three 0  NaN  NaN  NaN  NaN
Right now I manage to copy the data from df1 to df2 like this:
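# Copy each df1 row into every df2 row that shares its first-level label
for index, data in df1.iterrows():
    num = len(df2.loc[index])  # number of df2 rows under this label
    for i in range(num):
        df2.loc[(index, i)] = df1.loc[index]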
Result:
In : df2
Out:
               c1         c2        c3        c4
one   0  0.158366   0.843546  0.810493  0.925164
      1  0.158366   0.843546  0.810493  0.925164
two   0  0.880147   0.464835  0.416196  0.389786
      1  0.880147   0.464835  0.416196  0.389786
      2  0.880147   0.464835  0.416196  0.389786
three 0  0.138132  0.0618906  0.320366  0.727997
Any idea how to do this more efficiently?
Answer 0 (score: 2)
You can use DataFrame.align. It returns a tuple of the two aligned DataFrames, so add [1] to select the second one:
import numpy as np
import pandas as pd

np.random.seed(23)
df1 = pd.DataFrame(index=['one', 'two', 'three'], columns=['c1', 'c2', 'c3', 'c4'], data=np.random.random((3, 4)))
index = pd.MultiIndex.from_arrays([['one', 'one', 'two', 'two', 'two', 'three'], [0, 1, 0, 1, 2, 0]])
df2 = pd.DataFrame(index=index, columns=['c1', 'c2', 'c3', 'c4'])
print(df1)
             c1        c2        c3        c4
one    0.517298  0.946963  0.765460  0.282396
two    0.221045  0.686222  0.167139  0.392442
three  0.618052  0.411930  0.002465  0.884032
df3 = df2.align(df1, level=0)[1]
print(df3)
               c1        c2        c3        c4
one   0  0.517298  0.946963  0.765460  0.282396
      1  0.517298  0.946963  0.765460  0.282396
two   0  0.221045  0.686222  0.167139  0.392442
      1  0.221045  0.686222  0.167139  0.392442
      2  0.221045  0.686222  0.167139  0.392442
three 0  0.618052  0.411930  0.002465  0.884032
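For comparison, here is a minimal sketch of a different route to the same result. It is not part of the answer above: it takes the first-level labels of df2, looks up the matching df1 row for each of them, and then puts df2's MultiIndex on the result. The names cols, first_level and df3 are only illustrative, and the setup repeats the definitions from the question.

```python
import numpy as np
import pandas as pd

np.random.seed(23)
cols = ['c1', 'c2', 'c3', 'c4']  # illustrative helper name
df1 = pd.DataFrame(np.random.random((3, 4)),
                   index=['one', 'two', 'three'], columns=cols)
index = pd.MultiIndex.from_arrays(
    [['one', 'one', 'two', 'two', 'two', 'three'], [0, 1, 0, 1, 2, 0]])
df2 = pd.DataFrame(index=index, columns=cols)

# First-level label of every df2 row (labels repeat, e.g. 'one', 'one', ...)
first_level = df2.index.get_level_values(0)

# Label-based lookup with repeated labels returns one df1 row per label
df3 = df1.loc[first_level]

# Swap the flat, repeated index for df2's MultiIndex (same length, same order)
df3.index = df2.index

print(df3)
```

This avoids the Python-level loop, and unlike the loop it leaves df2 unchanged and returns a new float-typed DataFrame.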