In matplotlib

Time: 2015-09-11 22:39:57

Tags: python pandas matplotlib

I have the following data frame:

    strterminationreason                                total_trials    %Trials
0   Completed, Negative outcome/primary endpoint(s...   3130            6.390624
1   Completed, Outcome indeterminate                    3488            7.121565
2   Completed, Outcome unknown                          6483            13.236555
3   Completed, Positive outcome/primary endpoint(s...   15036           30.699498
4   Terminated, Business decision - Drug strategy ...   526             1.073952
5   Terminated, Business decision - Other               1340            2.735922
6   Terminated, Business decision - Pipeline repri...   1891            3.860917
7   Terminated, Early positive outcome                  231             0.471640
8   Terminated, Lack of efficacy                        1621            3.309649
9   Terminated, Lack of funding                         533             1.088244
10  Terminated, Other                                   1253            2.558291
11  Terminated, Planned but never initiated             4441            9.067336
12  Terminated, Poor enrollment                         3201            6.535587
13  Terminated, Safety/adverse effects                  993             2.027441
14  Terminated, Unknown                                 4811            9.82277

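I use the following code to draw the bars horizontally, because a regular vertical bar chart does not leave room for the long text labels above.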

df['%Trials']=(df.ix[:,1]/sum(df.ix[:,1]))*100

plt.figure(figsize=(35,20))
plt.barh(df.ix[:,2],df.index,align='edge')
plt.xlim([0,31])
plt.yticks(df.index, df.strterminationreason)
plt.ylabel("TerminationReason",fontsize=20)
plt.xlabel("%Trials",fontsize=20)

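However, in the output I get, the bar lengths do not reflect the actual % values in the data frame. For example, the highest percentage belongs to "Completed, Positive outcome/primary endpoint(s...", but the plot does not show that. Any idea why?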


Also, does anyone know how to place the text next to each bar properly, so that it is clean and nothing overlaps?

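1 Answer:

Answer 0: (score: 0)

Your plot is wrong because you are passing the arguments to barh in reverse order: barh expects the bar positions first and the bar widths second. See the matplotlib.pyplot.barh documentation for details. Here is a slightly modified script that fixes your problem: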

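import matplotlib.pyplot as plt

# df is the data frame from the question
bottom = list(range(len(df.index)))  # y position of each bar
width = df['%Trials']                # length of each bar
fig = plt.figure(figsize=(10,8))
ax = fig.add_subplot(111)

# positions first, widths second
ax.barh(bottom, width,color='r',align='edge')
ax.set_yticks(bottom)
# pass df.strterminationreason instead to show the reason text
ax.set_yticklabels(df.index)

plt.show()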
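
As a way to check the result independently of the plot, here is a minimal sketch that recomputes the percentages from the total_trials column and wraps the corrected barh call in a function. The names `percentages`, `plot_barh`, `pct` and `longest` are hypothetical and not part of the original answer. The totals are copied from the question.

```python
# Totals copied from the question's data frame (row order preserved).
totals = [3130, 3488, 6483, 15036, 526, 1340, 1891, 231,
          1621, 533, 1253, 4441, 3201, 993, 4811]


def percentages(values):
    """Return each value as a percentage of the sum of all values."""
    total = sum(values)
    return [100.0 * v / total for v in values]


def plot_barh(widths, labels=None):
    """Draw horizontal bars: y positions first, bar lengths second."""
    # Imported here so that percentages() can be used without matplotlib.
    import matplotlib.pyplot as plt

    positions = list(range(len(widths)))
    fig, ax = plt.subplots(figsize=(10, 8))
    ax.barh(positions, widths, align='center')
    ax.set_yticks(positions)
    if labels is not None:
        ax.set_yticklabels(labels)
    ax.set_xlabel("%Trials")
    return fig, ax


pct = percentages(totals)
# Row index of the longest bar; for the data above this is row 3.
longest = max(range(len(pct)), key=pct.__getitem__)
```

Calling `plot_barh(pct)` and then showing or saving the returned figure should put the longest bar at row 3, "Completed, Positive outcome/primary endpoint(s...", which matches the table in the question.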

As for fitting the long labels, you may need to adjust the font size, adjust the padding, or wrap the lines to keep them readable.
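
One possible way to wrap the labels is sketched below, assuming the standard library's textwrap module is acceptable for breaking lines. The helper names `wrap_labels` and `apply_wrapped_labels`, and the default width and font size, are hypothetical choices for illustration, not something the original answer prescribes.

```python
import textwrap


def wrap_labels(labels, width=30):
    """Break each label into lines of at most `width` characters."""
    return [textwrap.fill(str(label), width=width) for label in labels]


def apply_wrapped_labels(ax, labels, width=30, fontsize=9):
    """Set wrapped y tick labels on an existing matplotlib Axes."""
    ax.set_yticks(list(range(len(labels))))
    ax.set_yticklabels(wrap_labels(labels, width), fontsize=fontsize)
    # Recompute the margins so the multi-line labels are not clipped.
    ax.figure.tight_layout()
```

With the answer's script, `apply_wrapped_labels(ax, df.strterminationreason)` would replace the two tick calls. A smaller `width` gives narrower labels at the cost of more lines per bar, so the figure height may need to grow accordingly.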