How do I multiply two pandas columns from separate DataFrames?

Time: 2020-08-17 16:44:52

Tags: pandas

I have two pandas DataFrames: one is a lookup table and the other is the "main" table.

The lookup table looks like this:

import pandas as pd
lu_dict = {'state': ['OH', 'TX', 'IA', 'WY', 'KS'], 'fire_pct':[0.542630,.174425,0.206752,0.004621,0.441946]
          , 'hail_pct':[0.008787,0.440272,0.422005,0.434709,0.312338]
          ,'tw_pct':[0.101449,0.179536,0.159886,0.028349,0.151416]
          ,'other_pct':[0.224980,0.160096,0.149560,0.393357,0.036523]
          ,'wp_pct':[0.122154,0.045671,0.061796,0.138963,0.057777]}
lu = pd.DataFrame(lu_dict)

The main table looks like this:

preds_dict = {'state':['OH', 'TX', 'IA', 'WY', 'KS'],
             'fire_preds':[.01,.02,.03,.015,.66]
          , 'hail_preds':[.03,.005,.12,.23,.006]
          ,'tw_preds':[.001,.02,.0035,.04,.02]
          ,'other_preds':[.003,.05,.001,.01,.06]
          ,'wp_preds':[.002,.03,.005,.01,.04]}

preds = pd.DataFrame(preds_dict)

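I need to match each observation in the "main" table to the state column of the lookup table, then multiply fire_pct from the lookup table by fire_preds from the "main" table, other_pct by other_preds, wp_pct by wp_preds, and so on.

If a dictionary would work better for the lookup table, that's fine. I just need the main table to stay in its current DataFrame form for further processing.

Ultimately, the output I'm looking for is the sum of those products in a single column.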

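1 Answer:

Answer 0: (score: 1)

IIUC, you need to rename the columns so that pandas aligns the data correctly. Both frames are indexed by state so the rows align as well.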

mults = (lu.rename(columns=dict(zip(lu.columns, preds.columns))).set_index('state') * 
         preds.set_index('state'))
print(mults)

Output:

       fire_preds  hail_preds  tw_preds  other_preds  wp_preds
state                                                         
OH       0.005426    0.000264  0.000101     0.000675  0.000244
TX       0.003488    0.002201  0.003591     0.008005  0.001370
IA       0.006203    0.050641  0.000560     0.000150  0.000309
WY       0.000069    0.099983  0.001134     0.003934  0.001390
KS       0.291684    0.001874  0.003028     0.002191  0.002311
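The rename above pairs the columns by position through zip, so it relies on both frames listing their columns in the same order. A minimal sketch that pairs them by name instead, assuming every lookup column ends in `_pct` and its counterpart in the main table ends in `_preds` (the variable name `lu_aligned` is introduced here for illustration only):

```python
import pandas as pd

states = ['OH', 'TX', 'IA', 'WY', 'KS']
lu = pd.DataFrame({
    'state': states,
    'fire_pct': [0.542630, 0.174425, 0.206752, 0.004621, 0.441946],
    'hail_pct': [0.008787, 0.440272, 0.422005, 0.434709, 0.312338],
    'tw_pct': [0.101449, 0.179536, 0.159886, 0.028349, 0.151416],
    'other_pct': [0.224980, 0.160096, 0.149560, 0.393357, 0.036523],
    'wp_pct': [0.122154, 0.045671, 0.061796, 0.138963, 0.057777],
})
preds = pd.DataFrame({
    'state': states,
    'fire_preds': [.01, .02, .03, .015, .66],
    'hail_preds': [.03, .005, .12, .23, .006],
    'tw_preds': [.001, .02, .0035, .04, .02],
    'other_preds': [.003, .05, .001, .01, .06],
    'wp_preds': [.002, .03, .005, .01, .04],
})

# Index by state first, then rename each column by swapping its suffix,
# so the pairing does not depend on column order (assumes the _pct/_preds naming).
lu_aligned = lu.set_index('state').rename(columns=lambda c: c.replace('_pct', '_preds'))

# Element-wise product; pandas aligns on both the state index and the column names.
mults = lu_aligned * preds.set_index('state')
print(mults)
```

If a column in one frame has no counterpart in the other, the product for that column comes out as NaN, which makes a naming mismatch easy to spot.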

Sum of each column:

mults.sum()

fire_preds     0.306871
hail_preds     0.154963
tw_preds       0.008414
other_preds    0.014954
wp_preds       0.005624
dtype: float64

Sum by state:

mults.sum(axis=1)

state
OH    0.006711
TX    0.018656
IA    0.057861
WY    0.106510
KS    0.301089
dtype: float64