New df column: current df1 value divided by df2 value

Asked: 2018-05-21 16:49:22

Tags: python pandas dataframe

Consider these two dataframes:

df1 - mpg values observed nationwide

state  make    model    fuel    mpg
FL     honda   fit      diesel  43
FL     honda   fit      gas     33
FL     vw      golf     diesel  48
FL     vw      golf     gas     35
FL     ford    fiesta   diesel  40
FL     ford    fiesta   gas     36
FL     toyota  corolla  diesel  44
FL     toyota  corolla  gas     38

df2 - CAFE standards

make    model    fuel    mpg
honda   fit      diesel  43
honda   fit      gas     33
vw      golf     diesel  48
vw      golf     gas     35
ford    fiesta   diesel  40
ford    fiesta   gas     36
toyota  corolla  diesel  44
toyota  corolla  gas     38
nissan  sentra   diesel  39
nissan  sentra   gas     29

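I want to create a new column, df1['avg'], that holds the observed mpg divided by the CAFE standard for the same make, model, and fuel.

Here is what I tried by brute force: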

make_list = ['ford', 'nissan']
model_list = ['focus', 'sentra']
fuel_list = ['gas', 'diesel']

df3 = df2.loc[df2['make'].isin(make_list)]
df3 = df2.loc[df2['model'].isin(model_list)]
df3 = df2.loc[df2['fuel'].isin(fuel_list)]
goal = df3.iloc[0]['mpg']
print(goal)

for make in make_list:
    for model in model_list:
        for fuel in fuel_list:
            df1['avg'] = df1['mpg'] / goal

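This is actually for something much larger, but I put this together to show the idea. Thanks. This is my first post/question, so please be gentle.

1 Answer:

Answer 0 (score: 0)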

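Here is one way. The trick is to set the key columns as the index, which gives you a series of standards keyed by (make, model, fuel), and then map that series from one dataframe onto the other.

Note that the data in df1 and df2 appear to align exactly, so the ratio is 1.0 for every row.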

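# Columns that identify a vehicle in both dataframes
idx = ['make', 'model', 'fuel']

# Series of CAFE standards, indexed by (make, model, fuel)
s = df2.set_index(idx)['mpg'].dropna()

# Look up the standard for each df1 row, then divide observed mpg by it
df1['std'] = df1.set_index(idx).index.map(s.get)
df1['ratio'] = df1['mpg'] / df1['std']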

print(df1)

  state    make    model    fuel  mpg  std  ratio
0    FL   honda      fit  diesel   43   43    1.0
1    FL   honda      fit     gas   33   33    1.0
2    FL      vw     golf  diesel   48   48    1.0
3    FL      vw     golf     gas   35   35    1.0
4    FL    ford   fiesta  diesel   40   40    1.0
5    FL    ford   fiesta     gas   36   36    1.0
6    FL  toyota  corolla  diesel   44   44    1.0
7    FL  toyota  corolla     gas   38   38    1.0
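
As an alternative to mapping through an index, the same lookup can be sketched as a left merge on the key columns. This is only a sketch under a few assumptions: it uses a handful of rows copied from the question rather than the full tables, the names `keys`, `standards`, and `merged` are made up for illustration, and it assumes each (make, model, fuel) combination appears at most once in df2 (which `validate="many_to_one"` checks).

```python
import pandas as pd

# Key columns shared by both dataframes (illustrative name)
keys = ["make", "model", "fuel"]

# A few rows copied from the question's tables, not the full data
df1 = pd.DataFrame({
    "state": ["FL", "FL", "FL", "FL"],
    "make": ["honda", "honda", "vw", "vw"],
    "model": ["fit", "fit", "golf", "golf"],
    "fuel": ["diesel", "gas", "diesel", "gas"],
    "mpg": [43, 33, 48, 35],
})
df2 = pd.DataFrame({
    "make": ["honda", "honda", "vw", "vw", "nissan", "nissan"],
    "model": ["fit", "fit", "golf", "golf", "sentra", "sentra"],
    "fuel": ["diesel", "gas", "diesel", "gas", "diesel", "gas"],
    "mpg": [43, 33, 48, 35, 39, 29],
})

# Rename the standard's mpg column so it does not collide with df1's mpg
standards = df2[keys + ["mpg"]].rename(columns={"mpg": "std"})

# A left merge keeps every df1 row; rows with no matching standard get NaN in "std".
# validate raises an error if df2 contains duplicate key combinations.
merged = df1.merge(standards, on=keys, how="left", validate="many_to_one")
merged["ratio"] = merged["mpg"] / merged["std"]

print(merged)
```

With a left merge, a df1 row whose make, model, and fuel are missing from df2 ends up with NaN in both std and ratio instead of being dropped, which makes gaps in the standards table easy to spot.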