Consider these two DataFrames:
df1 - observed mpg values nationwide
state make model fuel mpg
FL honda fit diesel 43
FL honda fit gas 33
FL vw golf diesel 48
FL vw golf gas 35
FL ford fiesta diesel 40
FL ford fiesta gas 36
FL toyota corolla diesel 44
FL toyota corolla gas 38
df2 - CAFE standards
make model fuel mpg
honda fit diesel 43
honda fit gas 33
vw golf diesel 48
vw golf gas 35
ford fiesta diesel 40
ford fiesta gas 36
toyota corolla diesel 44
toyota corolla gas 38
nissan sentra diesel 39
nissan sentra gas 29
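I want to create a new column, df1['avg'], that is the observed mpg divided by the CAFE standard for the matching make, model, and fuel.
Here is what I tried by brute force: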
make_list = ['ford', 'nissan']
model_list = ['focus', 'sentra']
fuel_list = ['gas', 'diesel']
df3 = df2.loc[df2['make'].isin(make_list)]
df3 = df2.loc[df2['model'].isin(model_list)]
df3 = df2.loc[df2['fuel'].isin(fuel_list)]
goal = df3.iloc[0]['mpg']
print(goal)
for make in make_list:
    for model in model_list:
        for fuel in fuel_list:
            df1['avg'] = df1['mpg'] / goal
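This is actually for something much bigger, but I put this together to illustrate the problem. Thanks. This is my first post/question, so please be gentle.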
Answer 0: (score: 0)
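Here is one way. The trick is to set make, model, and fuel as the index of df2 to build a Series of standard mpg values, then map that Series onto the matching keys in df1.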
Note that the data in df1 and df2 appear to align exactly, so the ratio for every row is 1.0.
idx = ['make', 'model', 'fuel']
s = df2.set_index(idx)['mpg'].dropna()
df1['std'] = df1.set_index(idx).index.map(s.get)
df1['ratio'] = df1['mpg'] / df1['std']
print(df1)
state make model fuel mpg std ratio
0 FL honda fit diesel 43 43 1.0
1 FL honda fit gas 33 33 1.0
2 FL vw golf diesel 48 48 1.0
3 FL vw golf gas 35 35 1.0
4 FL ford fiesta diesel 40 40 1.0
5 FL ford fiesta gas 36 36 1.0
6 FL toyota corolla diesel 44 44 1.0
7 FL toyota corolla gas 38 38 1.0
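Because the sample data gives a ratio of 1.0 everywhere, it can be hard to see that the lookup is really working. As an alternative to the index mapping above, the same result can probably be reached with a left merge on the three key columns. The sketch below is only an illustration under assumptions: the small frames and their mpg values are made up so that the ratios differ, the kia row is a hypothetical vehicle with no standard, and the names keys, standards, and merged are not from the question.

```python
import pandas as pd

# Hypothetical miniature df1 (observed mpg); values are invented for illustration
df1 = pd.DataFrame({
    "state": ["FL", "FL", "FL"],
    "make": ["honda", "honda", "kia"],
    "model": ["fit", "fit", "rio"],
    "fuel": ["diesel", "gas", "gas"],
    "mpg": [43, 30, 31],
})

# Hypothetical miniature df2 (standards); one row per make/model/fuel
df2 = pd.DataFrame({
    "make": ["honda", "honda", "nissan"],
    "model": ["fit", "fit", "sentra"],
    "fuel": ["diesel", "gas", "gas"],
    "mpg": [43, 33, 29],
})

keys = ["make", "model", "fuel"]

# Rename the standard's mpg column so it does not collide with the observed mpg
standards = df2.rename(columns={"mpg": "std"})

# A left merge keeps every observed row; validate raises if df2 has duplicate keys
merged = df1.merge(standards, on=keys, how="left", validate="many_to_one")

# Rows with no matching standard get NaN in both std and ratio
merged["ratio"] = merged["mpg"] / merged["std"]
print(merged)
```

With these made-up values, the honda fit gas row comes out below 1.0 and the kia row has no ratio at all, which makes missing standards easy to spot before using the column further.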