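I am trying to multiply columns from two different data frames and store the result in a new df. The first data frame (df1) contains the prices of different items, with dates as the column headers. The second data frame (df2) contains the quantity of each item.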
df1
Date 1990-01-03 1990-01-04 1990-01-05 ... 2020-04-09 2020-04-14 2020-04-15
AAAAAAA 1.11 1.11 1.09 ... 102.22 103.46 103.96
BBBBBBB NaN NaN NaN ... 308.70 314.95 314.10
CCCCCCC NaN NaN NaN ... 65.34 58.72 56.18
DDDDDDD 5.52 5.51 5.53 ... 104.50 106.03 NaN
EEEEEEE NaN NaN NaN ... 1211.45 1269.23 NaN
FFFFFFF NaN NaN NaN ... 36.14 36.85 NaN
GGGGGGG 93.35 94.37 94.37 ... 1564.00 1537.50 1482.50
HHHHHHH NaN NaN NaN ... 45.69 46.68 46.24
IIIIIII NaN NaN NaN ... 75.10 74.88 74.40
JJJJJJJ 328.76 328.25 327.74 ... 6168.00 6448.00 6296.00
KKKKKKK NaN NaN NaN ... 23.49 23.50 24.04
LLLLLLL 4.45 4.41 4.34 ... 36.55 35.96 NaN
MMMMMMM 1.96 1.96 1.94 ... 141.23 146.03 NaN
NNNNNNN 1.09 1.09 1.09 ... 267.99 287.05 NaN
OOOOOOO 1.09 1.09 1.08 ... 201.53 207.17 NaN
PPPPPPP NaN NaN NaN ... 98.00 100.80 100.50
QQQQQQQ NaN NaN NaN ... 129.00 128.40 124.20
RRRRRRR NaN NaN NaN ... 140.60 141.45 139.60
[18 rows x 7658 columns]
and df2
Symbol Average Purchase Price Quantity
0 AAAAAAA 49.980 320.0
1 BBBBBBB 239.125 120.0
2 CCCCCCC 223.040 40.0
3 DDDDDDD 90.370 100.0
4 EEEEEEE 701.300 10.0
5 FFFFFFF 35.150 120.0
6 GGGGGGG 1259.000 700.0
7 HHHHHHH 32.050 250.0
8 IIIIIII 53.300 240.0
9 JJJJJJJ 6805.000 130.0
10 KKKKKKK 27.590 1000.0
11 LLLLLLL 82.120 170.0
12 MMMMMMM 106.470 150.0
13 NNNNNNN 95.970 308.0
14 OOOOOOO 81.420 150.0
15 PPPPPPP 39.690 60.0
16 QQQQQQQ 35.270 104.0
17 RRRRRRR 68.240 12.0
But when I run:
date = '2020-04-14'
total = df2[['Quantity']].mul(df1[date], axis=0)
print(total)
(Ideally I would like to do this for every date, but I am just learning, so I figured I should start with a single date.)
I get:
Quantity
0 NaN
1 NaN
2 NaN
3 NaN
4 NaN
5 NaN
6 NaN
7 NaN
8 NaN
9 NaN
10 NaN
11 NaN
12 NaN
13 NaN
14 NaN
15 NaN
16 NaN
17 NaN
AAAAAAA NaN
BBBBBBB NaN
CCCCCCC NaN
DDDDDDD NaN
EEEEEEE NaN
FFFFFFF NaN
GGGGGGG NaN
HHHHHHH NaN
IIIIIII NaN
JJJJJJJ NaN
KKKKKKK NaN
LLLLLLL NaN
MMMMMMM NaN
NNNNNNN NaN
OOOOOOO NaN
PPPPPPP NaN
QQQQQQQ NaN
RRRRRRR NaN
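How can I fix this?
Answer 0 (score: 1)
This is an index problem. The index of the resulting data frame suggests that the symbols form the index of the first data frame, while the second one has a sequential integer index. Because mul aligns rows by index label and the two sets of labels have nothing in common, every row of the result is NaN. Assuming no symbol is duplicated in either data frame, you can set Symbol as the index of the second one: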
date = '2020-04-14'
total = df2.set_index('Symbol')[['Quantity']].mul(df1[date], axis=0)
print(total)
This gives:
Quantity
Symbol
AAAAAAA 33107.2
BBBBBBB 37794.0
CCCCCCC 2348.8
DDDDDDD 10603.0
EEEEEEE 12692.3
FFFFFFF 4422.0
GGGGGGG 1076250.0
HHHHHHH 11670.0
IIIIIII 17971.2
JJJJJJJ 838240.0
KKKKKKK 23500.0
LLLLLLL 6113.2
MMMMMMM 21904.5
NNNNNNN 88411.4
OOOOOOO 31075.5
PPPPPPP 6048.0
QQQQQQQ 13353.6
RRRRRRR 1697.4
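The question also asks about doing this for every date at once. A minimal sketch of one way to do that, using a two-row excerpt of the data above rather than the full frames: turn Quantity into a Series indexed by Symbol and multiply the whole price frame by it along the rows. The names quantity and totals are only illustrative.

```python
import pandas as pd

# Two-row excerpt of the question's data, just enough to show the idea.
df1 = pd.DataFrame(
    {"2020-04-09": [102.22, 308.70], "2020-04-14": [103.46, 314.95]},
    index=["AAAAAAA", "BBBBBBB"],
)
df2 = pd.DataFrame({"Symbol": ["AAAAAAA", "BBBBBBB"], "Quantity": [320.0, 120.0]})

# Quantities as a Series indexed by symbol, so the labels match df1's index.
quantity = df2.set_index("Symbol")["Quantity"]

# axis=0 aligns the Series with df1's row labels; every date column is
# multiplied by the quantity of the matching symbol.
totals = df1.mul(quantity, axis=0)
print(totals)
```

Dates on which a price is NaN stay NaN in totals, which is usually what you want for items that had no price on that date.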
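Answer 1 (score: 0)
The problem is in the index: your data frames have different indexes. To make your code work, use the pandas.DataFrame.reset_index() method so that both data frames share the same index. You can use the following code.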
>>> df1.reset_index(inplace=True)
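This replaces the index of df1 with the integers 0 through 17, the same as the index of df2. The rows are then paired by position, so the result is only correct if the symbols appear in the same order in both data frames.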
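As a rough sketch of the complete reset_index route, again on a two-row excerpt of the question's data: the rename_axis call and the order check are additions that are not part of the answer, included here as assumptions to make the positional pairing explicit. The name prices is only illustrative.

```python
import pandas as pd

# Two-row excerpt of the question's data.
df1 = pd.DataFrame({"2020-04-14": [103.46, 314.95]}, index=["AAAAAAA", "BBBBBBB"])
df2 = pd.DataFrame({"Symbol": ["AAAAAAA", "BBBBBBB"], "Quantity": [320.0, 120.0]})

date = "2020-04-14"

# Name the index so the column created by reset_index() is called "Symbol",
# then move it into an ordinary column; the new index is 0..n-1.
prices = df1.rename_axis("Symbol").reset_index()

# Added guard (not in the answer): rows are now paired by position, so the
# symbols must be in the same order in both frames.
assert prices["Symbol"].tolist() == df2["Symbol"].tolist()

total = df2[["Quantity"]].mul(prices[date], axis=0)
print(total)
```

If the order check fails, setting Symbol as the index of df2, as in the other answer, avoids relying on row order at all.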