Stacked bar chart - percentage

Time: 2019-12-22 10:29:43

Tags: pandas stacked-chart

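I want to show this information as a percentage stacked bar chart. On the x-axis I want the age groups, and on the y-axis the percentage of each gender within each age group. Age is represented by bins in the dataset.

This is my code so far: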

c = ds.groupby(['Age','Gender'])['Gender'].count()

d=(((c /c.groupby(level=0).sum())*100).round()).astype('int64')

d

1 Answer:

Answer 0: (score: 0)

I created a test DataFrame:

import numpy as np

import pandas as pd

df = pd.DataFrame({'Gender': ['F','M','F','F','F','M','M','M','F','F','M','F','F','M','M','M','M','F','F','M','M','M'], 'Age': [17,10,20,51,53,15,50,60,43,28,35,67,33,17,20,40,43,47,48,51,53,54]})

You can use the pandas.cut function to split the ages into suitable intervals:

bins = pd.IntervalIndex.from_tuples([(0,17),(17,25),(25,35),(35,46),(46,50),(50,55), (55,np.inf)])
df['Age_interval'] = pd.cut(df['Age'], bins=bins)
df = df.groupby(['Age_interval', 'Gender']).size().unstack().fillna(0)

# Divide by the total of each age group (row), so F and M add up to 100 within every group

total = df['F'] + df['M']

df['F'] = df['F'] / total * 100

df['M'] = df['M'] / total * 100

# Labels match the right-closed bins above, e.g. (35, 46] -> '36-46'

df['Age'] = ['0-17', '18-25', '26-35', '36-46', '47-50', '51-55', '56+']

df.plot(kind='bar', x='Age', title = 'Gender distribution in Age groups', rot=0,figsize=(10,5), color=['turquoise','brown'], stacked=True)
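
As a possible alternative, the per-age-group percentages can also be computed in one step with pandas.crosstab and normalize='index'. The following is only a sketch using the same test data; the names edges, labels, age_group, pct and plot_percentages are illustrative assumptions, not part of the original answer, and the labels assume integer ages with the right-closed intervals that pandas.cut produces by default.

```python
import numpy as np
import pandas as pd

# Same test data as in the answer above.
df = pd.DataFrame({
    'Gender': ['F', 'M', 'F', 'F', 'F', 'M', 'M', 'M', 'F', 'F', 'M',
               'F', 'F', 'M', 'M', 'M', 'M', 'F', 'F', 'M', 'M', 'M'],
    'Age': [17, 10, 20, 51, 53, 15, 50, 60, 43, 28, 35,
            67, 33, 17, 20, 40, 43, 47, 48, 51, 53, 54],
})

# Assumed bin edges and labels (hypothetical names); pd.cut builds
# right-closed intervals by default: (0, 17], (17, 25], ...
edges = [0, 17, 25, 35, 46, 50, 55, np.inf]
labels = ['0-17', '18-25', '26-35', '36-46', '47-50', '51-55', '56+']
age_group = pd.cut(df['Age'], bins=edges, labels=labels)

# normalize='index' divides every row by its own total, so the
# F and M shares add up to 100 within each age group.
pct = pd.crosstab(age_group, df['Gender'], normalize='index') * 100


def plot_percentages(table):
    """Draw the percentage table as a stacked bar chart (requires matplotlib)."""
    return table.plot(kind='bar', stacked=True, rot=0, figsize=(10, 5),
                      color=['turquoise', 'brown'],
                      title='Gender distribution in Age groups')
```

Calling plot_percentages(pct) should then draw one bar per age group, each stacked to 100.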
