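DataFrame arithmetic always aligns on both index and column labels. If I have two DataFrames with the same number of columns but different column names, it seems I cannot do arithmetic between them: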
length = pd.DataFrame(data=np.random.normal(size=[5,2]),index=range(5),columns=['length1','length2'])
length
Out[2]:
length1 length2
0 -0.430872 1.087211
1 -0.788218 -0.440801
2 -0.540136 -1.217191
3 -0.561248 0.305545
4 0.158832 0.075283
height = pd.DataFrame(data=np.random.normal(size=[5,2]),index=range(1,6),columns=['height1','height2'])
height
Out[3]:
height1 height2
1 -1.105751 1.089808
2 -0.360827 -0.803927
3 0.454469 -0.766144
4 0.476534 -0.855870
5 -0.007049 0.038307
length*height
Out[4]:
height1 height2 length1 length2
0 NaN NaN NaN NaN
1 NaN NaN NaN NaN
2 NaN NaN NaN NaN
3 NaN NaN NaN NaN
4 NaN NaN NaN NaN
5 NaN NaN NaN NaN
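This is probably a safety measure to make sure you only operate on the data you intend to. Still, I wonder whether there is a way to perform an operation between two DataFrames (with the same number of columns) that aligns on the index axis only?
Edit: the original example was oversimplified because both DataFrames had the same index [0, 1, 2, 3, 4]. I shifted the second DataFrame's index by 1 to make it a better example.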
Answer 0 (score: 0)
Convert both DataFrames to NumPy arrays and multiply them element-wise:
ans = pd.DataFrame(length.values * height.values)
This gives:
0 1
0 0.396724 -0.264562
1 -0.460419 -0.285086
2 0.126083 -0.494675
3 -0.272121 0.305155
4 -0.159292 0.444439
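Note that multiplying `.values` pairs rows by position, not by index label. For the shifted index in the question, a minimal sketch could look like the following. It assumes the goal is to multiply only the rows whose labels appear in both frames; the seeded generator and the name `common` are illustrative and not from the original post.

```python
import numpy as np
import pandas as pd

# Seeded data with the same shapes and labels as in the question (values are illustrative).
rng = np.random.default_rng(0)
length = pd.DataFrame(rng.normal(size=(5, 2)), index=range(5),
                      columns=["length1", "length2"])
height = pd.DataFrame(rng.normal(size=(5, 2)), index=range(1, 6),
                      columns=["height1", "height2"])

# Keep only the row labels present in both frames, so rows are matched by label.
common = length.index.intersection(height.index)

# Multiply the underlying arrays; columns are paired by position, rows by label.
product = pd.DataFrame(
    length.loc[common].to_numpy() * height.loc[common].to_numpy(),
    index=common,
)
print(product)
```

Rows 0 and 5 are dropped here because each exists in only one of the two frames.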
Answer 1 (score: 0)
Understanding what user3589054 is trying to do, I think this code may work for you:
height.multiply(length.values, axis = 0)
Here is my output:
>>> length = pd.DataFrame(data=np.random.normal(size=[5,2]),index=range(5),columns=['length1','length2'])
>>> height = pd.DataFrame(data=np.random.normal(size=[5,2]),index=range(5),columns=['height1','height2'])
>>> length
length1 length2
0 1.000865 -0.758316
1 0.285942 -2.000440
2 -0.399625 0.686547
3 0.809561 1.238211
4 2.216696 -1.347227
>>> height
height1 height2
0 0.505477 -0.299634
1 -0.234154 -2.490459
2 -0.134534 1.063768
3 0.010025 0.435895
4 2.290053 -0.096494
>>> height.multiply(length.values, axis = 0)
height1 height2
0 0.505915 0.227217
1 -0.066954 4.982013
2 0.053763 0.730326
3 0.008116 0.539730
4 5.076352 0.129999
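Because `length.values` is a plain array, the call above also matches rows by position. If the result should instead follow pandas' usual index alignment (NaN for labels missing on either side), one possible approach is to relabel one frame's columns to match the other's before multiplying. This is a sketch rather than part of the answer above; the seeded data and the name `height_renamed` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Seeded data with the question's shifted index (values are illustrative).
rng = np.random.default_rng(1)
length = pd.DataFrame(rng.normal(size=(5, 2)), index=range(5),
                      columns=["length1", "length2"])
height = pd.DataFrame(rng.normal(size=(5, 2)), index=range(1, 6),
                      columns=["height1", "height2"])

# Map height's column names onto length's, pairing them by position.
height_renamed = height.rename(columns=dict(zip(height.columns, length.columns)))

# Columns now match, so only the index decides which rows are combined.
result = length.mul(height_renamed)
print(result)
```

The result covers the union of both indexes (0 through 5), with NaN in rows 0 and 5, which exist in only one frame.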