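I want to visualize the proportion of cmb_mpg per veh_class across two DataFrames. My question is how to build a new DataFrame from the list of results so that I can plot it.
I tried to plot the list directly, but got AttributeError: 'list' object has no attribute 'plot'.
If there is a way to do it, I would prefer to plot the list as a bar chart.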
prop = []
vehicle_classes = df_18["veh_class"].unique()
for v_class in vehicle_classes:
    cmb_mpg_08 = df_08[df_08['veh_class'] == v_class]['cmb_mpg'].mean()
    cmb_mpg_18 = df_18[df_18['veh_class'] == v_class]['cmb_mpg'].mean()
    proportion = cmb_mpg_18 / cmb_mpg_08
    prop.append("{}: {}".format(v_class, proportion))
prop
This is the output of the code block above.
['small SUV: nan',
'small car: 1.204497798280562',
'midsize car: 1.290841999329084',
'large car: 1.2647347740667978',
'standard SUV: nan',
'station wagon: 1.2308231787498904',
'pickup: 1.1420789918199243',
'special purpose: nan',
'minivan: 1.088']
Answer 0 (score: 0)
I believe you need:
import pandas as pd

# Small sample frames standing in for the real data
df_18 = pd.DataFrame({'veh_class': ['small SUV', 'small SUV', 'large car'],
                      'cmb_mpg': [1, 2, 3]})
df_08 = pd.DataFrame({'veh_class': ['small SUV', 'small SUV',
                                    'large car', 'another car'],
                      'cmb_mpg': [3, 4, 8, 5]})
In pandas it is best to avoid loops, so use groupby with the mean aggregation, and add reindex so that both outputs share the same index.
cmb_mpg_18 = df_18.groupby("veh_class")['cmb_mpg'].mean()
cmb_mpg_08 = df_08.groupby("veh_class")['cmb_mpg'].mean().reindex(cmb_mpg_18.index)
print(cmb_mpg_08)
veh_class
large car    8.0
small SUV    3.5
Name: cmb_mpg, dtype: float64
print(cmb_mpg_18)
veh_class
large car    3.0
small SUV    1.5
Name: cmb_mpg, dtype: float64
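To illustrate what reindex does here, a minimal sketch with made-up per-class means (the values and the extra "minivan" class are assumptions for illustration, not taken from the real data). A class present in 2018 but absent in 2008 ends up as NaN, which is most likely where the nan entries in the question's output come from:

```python
import pandas as pd

# Hypothetical per-class means, for illustration only.
means_18 = pd.Series({"large car": 3.0, "small SUV": 1.5, "minivan": 2.0},
                     name="cmb_mpg")
means_08 = pd.Series({"another car": 5.0, "large car": 8.0, "small SUV": 3.5},
                     name="cmb_mpg")

# reindex keeps exactly the labels of means_18, in that order:
# "minivan" has no 2008 value, so it becomes NaN;
# "another car" is not in the 2018 index, so it is dropped.
aligned_08 = means_08.reindex(means_18.index)
print(aligned_08)
```

After this step both Series carry the same labels, so an element-wise division pairs each class with itself.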
Then divide with div, use rename to set the new column name, and finally call reset_index to turn the index into a column:
proportion = cmb_mpg_18.div(cmb_mpg_08).rename('prop').reset_index()
print(proportion)
   veh_class      prop
0  large car  0.375000
1  small SUV  0.428571
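If some classes exist in only one of the two years, their ratio will be NaN. One possible way to handle that, sketched here with hypothetical means (the "minivan" class and all values are assumptions), is to drop those rows before building the final frame:

```python
import pandas as pd

# Hypothetical per-class means; "minivan" has no 2008 counterpart.
means_18 = pd.Series({"large car": 3.0, "small SUV": 1.5, "minivan": 2.0})
means_08 = pd.Series({"large car": 8.0, "small SUV": 3.5})

# div aligns on the index labels, so "minivan" yields NaN.
ratio = means_18.div(means_08)

proportion = (
    ratio.rename("prop")          # name of the value column
         .rename_axis("veh_class")  # name of the index, which becomes a column
         .dropna()                # discard classes missing in either year
         .reset_index()
)
print(proportion)
```

Whether to drop the NaN rows or keep them depends on what the chart should show; dropping them is just one option.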
Finally, call DataFrame.plot.bar for a bar plot, or DataFrame.plot for a line plot:
proportion.plot.bar(x='veh_class', y='prop')
proportion.plot(x='veh_class', y='prop')
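If you would rather keep the original loop, a rough sketch of one alternative is to collect each result as a dict instead of a formatted string and pass the list to pd.DataFrame. A plain list has no plot attribute, which is what raised the AttributeError, whereas a DataFrame does. The sample frames and the names records and prop_df are assumptions for illustration:

```python
import pandas as pd

# Hypothetical stand-ins for the real df_08 / df_18.
df_18 = pd.DataFrame({"veh_class": ["small SUV", "small SUV", "large car"],
                      "cmb_mpg": [1, 2, 3]})
df_08 = pd.DataFrame({"veh_class": ["small SUV", "small SUV",
                                    "large car", "another car"],
                      "cmb_mpg": [3, 4, 8, 5]})

records = []
for v_class in df_18["veh_class"].unique():
    mean_08 = df_08.loc[df_08["veh_class"] == v_class, "cmb_mpg"].mean()
    mean_18 = df_18.loc[df_18["veh_class"] == v_class, "cmb_mpg"].mean()
    # Keep the ratio numeric; a formatted string could not be plotted.
    records.append({"veh_class": v_class, "prop": mean_18 / mean_08})

# A list of dicts becomes one row per dict, one column per key.
prop_df = pd.DataFrame(records)
print(prop_df)

# With matplotlib installed, the frame can then be plotted:
# prop_df.plot.bar(x="veh_class", y="prop")
```

The groupby version above is usually shorter and faster; this variant only shows how the looped results can be turned into something plottable.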