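I want to show this information as a percentage stacked bar chart. On the x-axis I want the age groups, and on the y-axis the percentage of each gender within each age group. Age is represented by bins in the dataset.
Here is the code I have so far: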
c = ds.groupby(['Age','Gender'])['Gender'].count()
d = (c / c.groupby(level=0).sum() * 100).round().astype('int64')
d
Answer 0 (score: 0)
I created a test DataFrame:
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Gender': ['F','M','F','F','F','M','M','M','F','F','M','F','F','M','M','M','M','F','F','M','M','M'],
    'Age': [17,10,20,51,53,15,50,60,43,28,35,67,33,17,20,40,43,47,48,51,53,54],
})
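You can use the pandas.cut function to split the ages into suitable intervals, then count each gender per interval and convert the counts to percentages within each age group: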
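# Right-closed intervals: with integer ages, 17 falls in (0, 17] and 18 in (17, 25].
bins = pd.IntervalIndex.from_tuples([(0, 17), (17, 25), (25, 35), (35, 45), (45, 50), (50, 55), (55, np.inf)])
df['Age_interval'] = pd.cut(df['Age'], bins=bins)
# Count rows per (age interval, gender); observed=False keeps empty intervals.
counts = df.groupby(['Age_interval', 'Gender'], observed=False).size().unstack(fill_value=0)
# Divide each row by its own total so that every age group sums to 100%.
pct = counts.div(counts.sum(axis=1), axis=0) * 100
# Labels match the intervals above.
pct['Age'] = ['0-17', '18-25', '26-35', '36-45', '46-50', '51-55', '56+']
pct.plot(kind='bar', x='Age', title='Gender distribution in Age groups', rot=0, figsize=(10, 5), color=['turquoise', 'brown'], stacked=True)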
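As an alternative to dividing the counts by hand, pandas.crosstab with normalize='index' can compute the share of each gender within each age group in one step. The following is only a minimal sketch: the sample data, the bin edges, the labels, and the column name Age_group are assumptions made for illustration, not values from the question's dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names mirror the test frame in the answer.
df = pd.DataFrame({
    'Gender': ['F', 'M', 'F', 'F', 'M', 'M', 'F', 'M'],
    'Age': [17, 10, 20, 51, 15, 60, 43, 28],
})

# Bin edges and labels are assumptions chosen for this sketch.
# pd.cut uses right-closed intervals by default: (0, 17], (17, 35], ...
edges = [0, 17, 35, 50, np.inf]
labels = ['0-17', '18-35', '36-50', '51+']
df['Age_group'] = pd.cut(df['Age'], bins=edges, labels=labels)

# normalize='index' divides each row by its row total,
# so the gender shares within every age group sum to 100.
pct = pd.crosstab(df['Age_group'], df['Gender'], normalize='index') * 100

# Plotting is left commented out so the sketch runs without a display backend.
# pct.plot(kind='bar', stacked=True, rot=0)
```

Because each row of pct sums to 100, every stacked bar reaches the same height, which is what a percentage stacked bar chart is expected to look like.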