Boxplot whisker calculation with numpy or matplotlib

Date: 2016-04-05 10:10:21

Tags: python matplotlib boxplot

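I am trying to compute the whisker and box coordinates of a matplotlib boxplot. I can't see where my mistake is, or why I don't get the same values as the plot.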

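import numpy as np
import matplotlib.pyplot as plt

# becher is the data list given further down
Q1, median, Q3 = np.percentile(becher, [25, 50, 75])
IQR = Q3 - Q1
Qs = [Q1, median, Q3, Q1 - 1.5 * IQR, Q3 + 1.5 * IQR]
Qname = ["Q1", "median", "Q3", "Q1-1.5xIQR", "Q3+1.5xIQR"]
# Draw one labelled horizontal line per computed value
for Q, name in zip(Qs, Qname):
    plt.axhline(Q, color="k")
    plt.text(1.52, Q, name)
plt.boxplot(becher)
plt.show()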

In the resulting plot, Q1, Q3 and the median are all fine, but the whiskers do not match my computed values.


Here is my data:

becher = [9.1495,
 9.9479,
 9.7933,
 9.8002,
 8.47,
 9.14,
 9.06,
 9.6933,
 9.7871,
 10.5676,
 9.7441,
 10.4874,
 7.9584,
 7.9598,
 8.3483,
 7.2536,
 9.0823,
 10.8343,
 10.4104,
 7.2004,
 9.6297,
 9.96,
 9.761,
 9.684,
 8.6062,
 10.2098,
 8.9002,
 8.4511,
 9.3335,
 9.34946,
 8.0319,
 7.6379,
 7.8435,
 8.7572,
 8.0516,
 8.4134,
 10.0623,
 9.6406,
 9.0502,
 10.6821,
 11.1951,
 11.1876,
 10.0111,
 8.8456,
 10.2769,
 9.3939,
 11.3178,
 9.397,
 9.9851,
 9.9921,
 10.1132,
 8.9775,
 10.499,
 11.209,
 10.66,
 10.2704,
 10.9543,
 10.6529,
 10.9925,
 9.6625,
 7.8673,
 9.0023,
 8.9538,
 9.3961,
 8.8799,
 9.3722,
 10.697,
 9.808,
 9.894,
 9.5648,
 10.2994,
 9.0708,
 9.2368,
 8.8131,
 8.3218,
 10.1733,
 9.5885,
 10.7685,
 9.2015,
 9.881,
 9.4362,
 9.9686,
 9.3,
 9.979,
 9.896,
 10.05,
 9.9113,
 8.533,
 9.68297]

1 answer:

Answer 0 (score: 5)

There is one more adjustment, which is spelled out more explicitly in older docstrings, for example this one from Matplotlib v1.3.1:

*whis* : [ default 1.5 ]
  Defines the length of the whiskers as a function of the inner
  quartile range.  They extend to the most extreme data point
  within ( ``whis*(75%-25%)`` ) data range.

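So the whiskers extend to actual data points: each one stops at the most extreme observation that still lies within 1.5 x IQR of its quartile, not at the limit Q1 - 1.5 x IQR or Q3 + 1.5 x IQR itself. In your case, you can see this by adding a few lines to your script: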

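import numpy as np
import matplotlib.pyplot as plt

# Convert to an array so the comparisons below are element-wise
# (comparing a plain list with a float fails in Python 3)
becher = np.asarray(becher)

Q1, median, Q3 = np.percentile(becher, [25, 50, 75])
IQR = Q3 - Q1

# Limits at 1.5 x IQR beyond the quartiles
loval = Q1 - 1.5 * IQR
hival = Q3 + 1.5 * IQR

# The whiskers end at the most extreme data points inside those limits
wiskhi = np.compress(becher <= hival, becher)
wisklo = np.compress(becher >= loval, becher)
actual_hival = np.max(wiskhi)
actual_loval = np.min(wisklo)

Qs = [Q1, median, Q3, loval, hival, actual_loval, actual_hival]
Qname = ["Q1", "median", "Q3", "Q1-1.5xIQR", "Q3+1.5xIQR",
         "Actual LO", "Actual HI"]

# Draw one labelled horizontal line per computed value
for Q, name in zip(Qs, Qname):
    plt.axhline(Q, color="k")
    plt.text(1.52, Q, name)
plt.boxplot(becher)
plt.show()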
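
If you only need the numbers and not the plot, the same rule can be wrapped in a small helper. This is a minimal sketch of the rule quoted from the docstring, not Matplotlib's own code; the function name `whisker_ends` and the toy `sample` data are made up for illustration.

```python
import numpy as np


def whisker_ends(data, whis=1.5):
    """Return (low, high): the most extreme data points that still lie
    within whis * IQR of the first and third quartiles."""
    x = np.asarray(data, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    lo_limit = q1 - whis * iqr
    hi_limit = q3 + whis * iqr
    # Keep only the points inside the limits, then take their extremes
    inside = x[(x >= lo_limit) & (x <= hi_limit)]
    return inside.min(), inside.max()


# Hypothetical toy data: 30.0 lies beyond the upper limit
sample = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 30.0]
low, high = whisker_ends(sample)
print(low, high)
```

For this sample Q1 = 3 and Q3 = 7, so the limits are -3 and 13, yet the whiskers end at 1.0 and 8.0 because those are the outermost data points inside the limits. The point at 30.0 would be drawn as a flier.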