How to create a pandas DataFrame from a list

Time: 2018-01-27 06:27:32

Tags: python pandas matplotlib

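I want to visualize, for each veh_class, the ratio of the mean cmb_mpg between two DataFrames (df_18 relative to df_08). My question is how to create a new DataFrame from the list of results so that I can plot it.

I tried to plot the list directly, but I got AttributeError: 'list' object has no attribute 'plot'

If there is a way to do it, I would prefer a bar chart of the list.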

prop = []
vehicle_classes = df_18["veh_class"].unique()

for v_class in vehicle_classes:
    cmb_mpg_08 = df_08[df_08['veh_class'] == v_class]['cmb_mpg'].mean()
    cmb_mpg_18 = df_18[df_18['veh_class'] == v_class]['cmb_mpg'].mean()
    proportion = cmb_mpg_18 / cmb_mpg_08
    prop.append("{}: {}".format(v_class, proportion))

prop

This is the output of the code block above.

['small SUV: nan',
 'small car: 1.204497798280562',
 'midsize car: 1.290841999329084',
 'large car: 1.2647347740667978',
 'standard SUV: nan',
 'station wagon: 1.2308231787498904',
 'pickup: 1.1420789918199243',
 'special purpose: nan',
 'minivan: 1.088']

1 answer:

Answer 0 (score: 0)

I believe you need the following. Sample data first:

import pandas as pd

df_18 = pd.DataFrame({'veh_class':['small SUV','small SUV','large car'],
                      'cmb_mpg':[1,2,3]})

df_08 = pd.DataFrame({'veh_class':['small SUV','small SUV',
                                   'large car', 'another car'],
                      'cmb_mpg':[3,4,8,5]})

In pandas it is best to avoid loops, so use groupby and aggregate with mean, then add reindex so that both outputs share the same index:

cmb_mpg_18 = df_18.groupby("veh_class")['cmb_mpg'].mean()
cmb_mpg_08 = df_08.groupby("veh_class")['cmb_mpg'].mean().reindex(cmb_mpg_18.index)

print (cmb_mpg_08)
veh_class
large car    8.0
small SUV    3.5
Name: cmb_mpg, dtype: float64

print (cmb_mpg_18)
veh_class
large car    3.0
small SUV    1.5
Name: cmb_mpg, dtype: float64
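
The reindex step also suggests where the nan values in the question's output come from: a class that appears in the 2018 data but has no rows in the 2008 data has no 2008 mean to divide by. A minimal sketch with made-up numbers (the 'minivan' row and all values are hypothetical, not the asker's data), showing the NaN and one way to drop it with dropna:

```python
import pandas as pd

# Hypothetical data: 'minivan' exists only in the 2018 frame.
df_18 = pd.DataFrame({'veh_class': ['small SUV', 'large car', 'minivan'],
                      'cmb_mpg': [20, 30, 25]})
df_08 = pd.DataFrame({'veh_class': ['small SUV', 'large car'],
                      'cmb_mpg': [10, 15]})

mean_18 = df_18.groupby('veh_class')['cmb_mpg'].mean()
# reindex aligns the 2008 means to the 2018 labels;
# labels missing from 2008 are filled with NaN.
mean_08 = df_08.groupby('veh_class')['cmb_mpg'].mean().reindex(mean_18.index)

ratio = mean_18.div(mean_08)   # NaN wherever the 2008 mean is missing
ratio_clean = ratio.dropna()   # keep only classes present in both years
```

Whether to drop those classes or keep them as gaps in the plot depends on what the chart is meant to show.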

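Then divide with div, use rename to give the resulting Series a new name (which becomes the column name), and finally call reset_index to turn the index into a column: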

proportion = cmb_mpg_18.div(cmb_mpg_08).rename('prop').reset_index()
print (proportion)
   veh_class      prop
0  large car  0.375000
1  small SUV  0.428571
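
One detail worth illustrating: div matches values by index label, not by position, so the two Series do not need to be in the same order. A small sketch using the same numbers as above, with the second Series deliberately shuffled (the variable names a and b are only for illustration):

```python
import pandas as pd

a = pd.Series([3.0, 1.5],
              index=pd.Index(['large car', 'small SUV'], name='veh_class'))
# Same labels as `a`, but listed in a different order.
b = pd.Series([3.5, 8.0],
              index=pd.Index(['small SUV', 'large car'], name='veh_class'))

ratio = a.div(b)  # values are paired by label: 3.0 / 8.0 and 1.5 / 3.5
# rename sets the Series name; reset_index moves the labels into a column.
result = ratio.rename('prop').reset_index()
```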

Finally, call DataFrame.plot.bar for a bar plot, or DataFrame.plot for a line plot:

proportion.plot.bar(x='veh_class', y='prop')

proportion.plot(x='veh_class', y='prop')
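
If you would rather keep the original loop, the list itself can be turned into a DataFrame, provided each entry is stored as a (label, number) pair instead of a formatted string. This is only a sketch: the two frames below are hypothetical stand-ins for the question's df_08 and df_18, and the names rows and prop_df are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-ins for the question's df_08 and df_18.
df_08 = pd.DataFrame({'veh_class': ['small car', 'small car', 'pickup'],
                      'cmb_mpg': [20, 22, 15]})
df_18 = pd.DataFrame({'veh_class': ['small car', 'pickup', 'pickup'],
                      'cmb_mpg': [25.2, 17, 19]})

rows = []
for v_class in df_18['veh_class'].unique():
    mean_08 = df_08.loc[df_08['veh_class'] == v_class, 'cmb_mpg'].mean()
    mean_18 = df_18.loc[df_18['veh_class'] == v_class, 'cmb_mpg'].mean()
    # Append a (label, number) tuple rather than a string like "pickup: 1.2",
    # so the ratio stays numeric and can be plotted.
    rows.append((v_class, mean_18 / mean_08))

# One row per tuple, with explicit column names.
prop_df = pd.DataFrame(rows, columns=['veh_class', 'prop'])
```

With the result in this shape, prop_df.plot.bar(x='veh_class', y='prop') should work the same way as for proportion above.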