Question

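I have a pandas DataFrame with age-bracket dummy variables, specifically '<35', '35-44', '45-54', '55-64', and '65+'. I also have another dummy variable, 'grey?', which indicates whether a person's hair has gone grey.

I want to plot a bar chart showing how many people per 1,000 in each age bracket have grey hair. So basically, for each age-group dummy, compute (people with grey dummy == 1 / people with age-group dummy == 1) * 1000 and plot the results as a bar chart.

What is the best way to do this?

Edit: I eventually came up with a way to do it, but it's probably not the best way.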

import numpy as np
import matplotlib.pyplot as plt

# Jupyter magic; remove this line when running as a plain script
%matplotlib inline
plt.rc("font", size=14)

age_cols = ['<35', '35-44', '45-54', '55-64', '65+']
counts_list = []

for col in age_cols:
    # Counts of each 'grey?' value within each level (0/1) of the age dummy
    counts_df = df.groupby(col)['grey?'].value_counts()
    try:
        # counts_df[1] selects the rows where the age dummy == 1
        counts_list.append(counts_df[1][1] / (counts_df[1][1] + counts_df[1][0]) * 1000)
    except KeyError:
        # A missing combination (e.g. an empty age group) falls back to 0
        counts_list.append(0)

y_pos = np.arange(len(age_cols))

plt.bar(y_pos, counts_list, align='center', alpha=0.5)
plt.xticks(y_pos, age_cols)
plt.ylabel('Grey/1k')
plt.title('Grey by Age')

plt.show()

Is there a more idiomatic/Pythonic way to do this?

Answer 1

I eventually came up with a way to do it, but it's probably not the best way.

import numpy as np
import matplotlib.pyplot as plt

# Jupyter magic; remove this line when running as a plain script
%matplotlib inline
plt.rc("font", size=14)

age_cols = ['<35', '35-44', '45-54', '55-64', '65+']
counts_list = []

for col in age_cols:
    # Counts of each 'grey?' value within each level (0/1) of the age dummy
    counts_df = df.groupby(col)['grey?'].value_counts()
    try:
        # counts_df[1] selects the rows where the age dummy == 1
        counts_list.append(counts_df[1][1] / (counts_df[1][1] + counts_df[1][0]) * 1000)
    except KeyError:
        # A missing combination (e.g. an empty age group) falls back to 0
        counts_list.append(0)

y_pos = np.arange(len(age_cols))

plt.bar(y_pos, counts_list, align='center', alpha=0.5)
plt.xticks(y_pos, age_cols)
plt.ylabel('Grey/1k')
plt.title('Grey by Age')

plt.show()
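
As for a more idiomatic route, one possible sketch is to skip the loop and let pandas do the arithmetic column-wise: multiply each age dummy by 'grey?', sum, and divide by the group sizes. The helper names `grey_per_thousand` and `plot_grey_rates` and the small `demo` frame are made up for illustration and are not part of the original post. It assumes every column holds 0/1 values.

```python
import pandas as pd

AGE_COLS = ['<35', '35-44', '45-54', '55-64', '65+']


def grey_per_thousand(df, age_cols, flag_col='grey?'):
    """Rate of flag_col == 1 per 1000 people within each age dummy == 1."""
    # dummy * flag is 1 only where both are 1, so the column sums count
    # the grey-haired people inside each age group.
    grey_counts = df[age_cols].mul(df[flag_col], axis=0).sum()
    group_sizes = df[age_cols].sum()
    # An empty group gives 0/0 = NaN; report it as 0, like the loop version.
    return (grey_counts / group_sizes * 1000).fillna(0)


def plot_grey_rates(rates):
    """Bar chart of the rates; pandas draws it with matplotlib."""
    ax = rates.plot.bar(alpha=0.5, rot=0, title='Grey by Age')
    ax.set_ylabel('Grey/1k')
    return ax


# Made-up data, only to show the call (assumption: 0/1 dummies).
demo = pd.DataFrame({
    '<35':   [1, 1, 0, 0, 0, 0],
    '35-44': [0, 0, 1, 1, 0, 0],
    '45-54': [0, 0, 0, 0, 0, 0],
    '55-64': [0, 0, 0, 0, 1, 0],
    '65+':   [0, 0, 0, 0, 0, 1],
    'grey?': [0, 1, 1, 1, 0, 1],
})
rates = grey_per_thousand(demo, AGE_COLS)
```

Calling `plot_grey_rates(rates)` followed by `matplotlib.pyplot.show()` should then draw the chart. One difference from the loop version worth noting: when everyone in a group is grey, this sketch reports 1000, whereas the loop hits the missing `counts_df[1][0]` key and falls back to 0.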

Bar chart of counts where a dummy variable == 1 and a separate variable == 1 in a Series
