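I managed to create a plot that shows the number of records for each class label in each age group of my Pandas DataFrame. I would also like to see a percentage label for the 'non functional' class in each age group.
The Python code for the plot is: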
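import matplotlib.pyplot as plt

# Age of each water point: year of the record minus the construction year
train['age_wpt'] = train.date_recorded.str.split('-').str.get(0).apply(int) - train.construction_year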
figure = plt.figure(figsize=(15,8))
plt.hist([
train[(train.status_group=='functional') & (train.age_wpt < 60.0) & (train.age_wpt >= 0.0)]['age_wpt'],
train[(train.status_group=='non functional') & (train.age_wpt < 60.0) & (train.age_wpt >= 0.0)]['age_wpt'],
train[(train.status_group=='functional needs repair') & (train.age_wpt < 60.0) & (train.age_wpt >= 0.0)]['age_wpt']
],
stacked=True, color = ['b','r','y'],
bins = 30,label = ['functional','non functional', 'functional needs repair'])
plt.xlabel('Age')
plt.ylabel('Number of records')
plt.legend()
Answer 0 (score: 0)
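normed : boolean, optional
If True, the first element of the return tuple will be the counts normalized to form a probability density, i.e., n/(len(x)*dbin), so that the integral of the histogram sums to 1. If stacked is also True, the sum of the histograms is normalized to 1. Default is False.
Note: newer Matplotlib releases call this parameter density; normed was deprecated in 2.1 and removed in 3.1.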
plt.hist([
train[(train.status_group=='functional') & (train.age_wpt < 60.0) & (train.age_wpt >= 0.0)]['age_wpt'],
train[(train.status_group=='non functional') & (train.age_wpt < 60.0) & (train.age_wpt >= 0.0)]['age_wpt'],
train[(train.status_group=='functional needs repair') & (train.age_wpt < 60.0) & (train.age_wpt >= 0.0)]['age_wpt']
],
stacked=False, color = ['b','r','y'], normed=True,  # on Matplotlib >= 3.1 use density=True instead of normed=True
bins = 30,label = ['functional','non functional', 'functional needs repair'])
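A normalized histogram shows a density for each series, not the share of one class within each age bin. If the goal is a percentage label per bin, one possible approach is to compute the counts yourself and write the text onto the axes. The following is only a sketch under some assumptions: the helper names `non_functional_percent` and `label_non_functional` are made up for illustration, the bins are fixed to the range 0 to 60 (the original call lets Matplotlib derive the range from the data), and every record in range is counted toward the bar total.

```python
import numpy as np


def non_functional_percent(ages, status, bins=30, age_range=(0, 60),
                           target="non functional"):
    """Return (edges, total, percent) for the given age bins.

    total[i]   = number of records whose age falls in bin i
    percent[i] = share of `target` records in bin i, in percent
                 (NaN where the bin is empty)
    """
    ages = np.asarray(ages, dtype=float)
    status = np.asarray(status)

    # Same filter as in the question: 0 <= age < 60 (assumed range).
    lo, hi = age_range
    in_range = (ages >= lo) & (ages < hi)

    total, edges = np.histogram(ages[in_range], bins=bins, range=age_range)
    hits, _ = np.histogram(ages[in_range & (status == target)], bins=edges)

    # Avoid warnings for empty bins; they are reported as NaN.
    with np.errstate(divide="ignore", invalid="ignore"):
        percent = np.where(total > 0, 100.0 * hits / total, np.nan)
    return edges, total, percent


def label_non_functional(ax, edges, total, percent):
    """Write the percentage above each stacked bar on an existing Axes."""
    centers = (edges[:-1] + edges[1:]) / 2
    for x, height, pct in zip(centers, total, percent):
        if not np.isnan(pct):
            ax.text(x, height, f"{pct:.0f}%",
                    ha="center", va="bottom", fontsize=8)


# Hypothetical usage with the DataFrame from the question:
#   edges, total, percent = non_functional_percent(train.age_wpt, train.status_group)
#   plt.hist([...], stacked=True, bins=edges, ...)   # pass the same edges
#   label_non_functional(plt.gca(), edges, total, percent)
```

Passing `bins=edges` to `plt.hist` keeps the bars and the labels on the same bin boundaries, so each label sits on top of the stacked bar it describes.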