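I want to convert absolute frequencies to percentages (0%-100%), but my chart shows numbers such as 70000%. I would also like to display the percentage inside each stacked segment, e.g. bar 1 -> 35% vs 65%, and so on.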
def to_percent(y, position):
    s = str(100 * y)
    if matplotlib.rcParams['text.usetex'] is True:
        return s + r'$\%$'
    else:
        return s + '%'
filter = df["CLUSTER"] == 1
plt.ylabel("Percentage")
plt.hist([df["HOUR"][filter],df["HOUR"][~filter]],stacked=True,
color=['#8A2BE2', '#EE3B3B'], label=['1','0'])
plt.legend()
formatter = FuncFormatter(to_percent)
# Set the formatter
plt.gca().yaxis.set_major_formatter(formatter)
plt.show()
A small fake dataset:
s_hour = pd.Series(["5","5","5","8","8","9","10"])
s_cluster = pd.Series(["1","1","0","1","0","1","0"])
df = pd.concat([s_hour, s_cluster], axis=1)
df
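Answer 0 (score: 0)
First, set the normed parameter of plt.hist to True to normalize the data (Matplotlib 3.1 and later removed normed; the equivalent argument there is density).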
# density=True is the current name for normed=True, which Matplotlib 3.1 removed
plt.hist([df["HOUR"][filter],df["HOUR"][~filter]],stacked=True, density=True,
         color=['#8A2BE2', '#EE3B3B'], label=['1','0'])
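One caveat worth checking: normalizing a histogram this way scales the bars so that their total area is 1, so a bar height equals a fraction of the data only when the bin width is 1. The arithmetic below is a minimal sketch of that difference. It assumes the default of 10 equal-width bins spanning 5 to 10 for the sample data (bin width 0.5); the helper names density_height and fraction_height are made up for illustration.

```python
def density_height(count, total, bin_width):
    """Bar height when the histogram is normalized so its area is 1."""
    return count / (total * bin_width)


def fraction_height(count, total):
    """Bar height when each bar should show its share of all observations."""
    return count / total


total = 7                  # number of rows in the sample dataset
count_hour_5 = 3           # rows with HOUR == 5
bin_width = (10 - 5) / 10  # assumption: 10 equal bins between 5 and 10

print('{:.1%}'.format(density_height(count_hour_5, total, bin_width)))  # 85.7%
print('{:.1%}'.format(fraction_height(count_hour_5, total)))            # 42.9%
```

If plain fractions are what you are after, two options that may help are choosing bins of width 1 or passing weights to plt.hist so that every observation contributes 1/total.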
I can't comment on the rcParams condition in your formatting function, but this format string works: '{:3.1%}'.
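As a quick illustration of why the format string helps, here is a small sketch comparing it with the str(100 * y) approach from the question. The function names are hypothetical; the point is that '{:3.1%}' expects a fraction between 0 and 1, while the original formatter multiplies a raw count by 100.

```python
def to_percent_from_fraction(y, position=None):
    # The % presentation type multiplies by 100 and appends the percent sign,
    # so y should already be a fraction such as 0.35.
    return '{:3.1%}'.format(y)


def to_percent_from_count(y, position=None):
    # The question's approach: a raw count of 700 becomes "70000%".
    return str(100 * y) + '%'


print(to_percent_from_fraction(0.35))  # 35.0%
print(to_percent_from_fraction(0.65))  # 65.0%
print(to_percent_from_count(700))      # 70000%
```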
See whether this is what you want:
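import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter

def to_percent(y, p):
    # y is the tick value (a fraction once the histogram is normalized)
    return '{:3.1%}'.format(y)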
s_hour = pd.Series(["5","5","5","8","8","9","10"])
s_cluster = pd.Series(["1","1","0","1","0","1","0"])
df = pd.concat([s_hour.astype(np.int32), s_cluster.astype(np.int32)], axis=1)
df.columns = ["HOUR", "CLUSTER"]
filter = df["CLUSTER"] == 1
plt.ylabel("Percentage")
# density=True is the current name for normed=True, which Matplotlib 3.1 removed
plt.hist([df["HOUR"][filter],df["HOUR"][~filter]],stacked=True, density=True,
         color=['#8A2BE2', '#EE3B3B'], label=['1','0'])
plt.legend()
formatter = FuncFormatter(to_percent)
# Set the formatter
plt.gca().yaxis.set_major_formatter(formatter)
plt.show()
plt.close()
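The code above formats the axis but does not label the individual stacked segments, which the question also asks for. The following is only a sketch of one possible approach, and it swaps plt.hist for a plain stacked bar chart with one bar per hour, where each bar is split into the share of cluster 1 and cluster 0 within that hour. The function names cluster_shares and plot_percent_stacked are made up for illustration, and the clusters are assumed to be the integers 0 and 1.

```python
from collections import Counter


def cluster_shares(hours, clusters):
    """Return {hour: (share of cluster 1, share of cluster 0)} within each hour."""
    totals = Counter(hours)
    ones = Counter(h for h, c in zip(hours, clusters) if c == 1)
    return {
        h: (ones[h] / totals[h], (totals[h] - ones[h]) / totals[h])
        for h in sorted(totals)
    }


def plot_percent_stacked(hours, clusters):
    """Draw one 100% stacked bar per hour and label each segment."""
    # Imported here so the share calculation above works without Matplotlib.
    import matplotlib.pyplot as plt
    from matplotlib.ticker import PercentFormatter

    shares = cluster_shares(hours, clusters)
    labels = [str(h) for h in shares]
    ones = [s[0] for s in shares.values()]
    zeros = [s[1] for s in shares.values()]

    fig, ax = plt.subplots()
    ax.bar(labels, ones, color='#8A2BE2', label='1')
    ax.bar(labels, zeros, bottom=ones, color='#EE3B3B', label='0')

    # Put the percentage in the middle of every non-empty segment.
    for i, (one, zero) in enumerate(zip(ones, zeros)):
        if one > 0:
            ax.text(i, one / 2, '{:.1%}'.format(one), ha='center', va='center')
        if zero > 0:
            ax.text(i, one + zero / 2, '{:.1%}'.format(zero), ha='center', va='center')

    ax.yaxis.set_major_formatter(PercentFormatter(xmax=1.0))
    ax.set_xlabel("HOUR")
    ax.set_ylabel("Percentage")
    ax.legend()
    return ax
```

With the sample data, calling plot_percent_stacked([5, 5, 5, 8, 8, 9, 10], [1, 1, 0, 1, 0, 1, 0]) followed by plt.show() should give a bar for hour 5 split roughly 66.7% vs 33.3%. Note that this shows shares within each hour, which may or may not be the normalization you intended.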