I have the following data frame:
strterminationreason total_trials %Trials
0 Completed, Negative outcome/primary endpoint(s... 3130 6.390624
1 Completed, Outcome indeterminate 3488 7.121565
2 Completed, Outcome unknown 6483 13.236555
3 Completed, Positive outcome/primary endpoint(s... 15036 30.699498
4 Terminated, Business decision - Drug strategy ... 526 1.073952
5 Terminated, Business decision - Other 1340 2.735922
6 Terminated, Business decision - Pipeline repri... 1891 3.860917
7 Terminated, Early positive outcome 231 0.471640
8 Terminated, Lack of efficacy 1621 3.309649
9 Terminated, Lack of funding 533 1.088244
10 Terminated, Other 1253 2.558291
11 Terminated, Planned but never initiated 4441 9.067336
12 Terminated, Poor enrollment 3201 6.535587
13 Terminated, Safety/adverse effects 993 2.027441
14 Terminated, Unknown 4811 9.82277
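I used the following code to plot the bars horizontally, because a regular (vertical) bar chart has no room for the long text labels above.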
df['%Trials']=(df.ix[:,1]/sum(df.ix[:,1]))*100
plt.figure(figsize=(35,20))
plt.barh(df.ix[:,2],df.index,align='edge')
plt.xlim([0,31])
plt.yticks(df.index, df.strterminationreason)
plt.ylabel("TerminationReason",fontsize=20)
plt.xlabel("%Trials",fontsize=20)
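But in the output I get, the bar lengths do not reflect the actual % values in the data frame. For example, the highest percentage belongs to "Completed, Positive outcome/primary endpoint(s...", yet the plot does not show that. Any idea why?
Also, does anyone know how to place the text for each bar properly, so that the labels are clean and do not overlap?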
Answer 0 (score: 0)
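Your plot is wrong because you pass the arguments to barh in the reverse order: barh expects the y positions of the bars first and their widths second. The documentation for matplotlib.pyplot.barh describes the parameters. Here is a slightly modified script that fixes your problem: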
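import matplotlib.pyplot as plt

# y positions of the bars, one per row of df
bottom = range(len(df.index))
# bar lengths: the percentage of trials in each row
width = df['%Trials']
fig = plt.figure(figsize=(10,8))
ax = fig.add_subplot(111)
# barh takes the y positions first, then the widths
ax.barh(bottom, width,color='r',align='edge')
ax.set_yticks(bottom)
# label each bar with its termination reason
ax.set_yticklabels(df['strterminationreason'])
plt.show()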
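For the second part of the question (clean, non-overlapping text), one possible approach is to keep the reasons as y tick labels and write each percentage just past the end of its bar. The following is only a sketch under assumptions: it rebuilds a four-row subset of the question's data by hand, centers the bars on their ticks, and uses a hypothetical output file name; the offset of 0.2 and the 15% extra x range are arbitrary choices.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Assumption: a small subset of the question's data, typed in by hand.
df = pd.DataFrame({
    "strterminationreason": [
        "Completed, Outcome indeterminate",
        "Completed, Outcome unknown",
        "Terminated, Lack of efficacy",
        "Terminated, Poor enrollment",
    ],
    "%Trials": [7.121565, 13.236555, 3.309649, 6.535587],
})

y_pos = list(range(len(df)))
fig, ax = plt.subplots(figsize=(10, 4))

# y positions first, widths second; align='center' puts each bar on its tick.
bars = ax.barh(y_pos, df["%Trials"], align="center")
ax.set_yticks(y_pos)
ax.set_yticklabels(df["strterminationreason"])
ax.set_xlabel("%Trials")

# Write the percentage just past the end of each bar, vertically centered.
for bar, pct in zip(bars, df["%Trials"]):
    ax.text(bar.get_width() + 0.2,
            bar.get_y() + bar.get_height() / 2,
            f"{pct:.1f}%",
            va="center")

# Leave some room on the right so the value labels are not clipped.
ax.set_xlim(0, df["%Trials"].max() * 1.15)
fig.tight_layout()  # keeps the long tick labels inside the figure

# Hypothetical file name; call plt.show() instead in an interactive session.
fig.savefig("termination_reasons.png")
```

Because each reason sits on its own tick and each value is anchored to its own bar, the labels should not overlap as long as the figure is tall enough for the number of rows.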