Question

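Given that regplot computes the mean within each interval and bootstraps to find a confidence interval for every bin, it seems a waste to have to recalculate them manually for further study, so: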

Question: How do I access the means and confidence intervals that regplot draws?

Example: This code produces a plot of the bin means with CIs.

import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

# just some random numbers to get started
fig, ax = plt.subplots()
x = np.random.uniform(-2, 2, 1000)
y = np.random.normal(x**2, np.abs(x) + 1)

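# Manual binning to retain control: ten bin centers, each 0.4 wide
binwidth = 4. / 10
x_bins = np.arange(-2 + binwidth / 2, 2, binwidth)
# fit_reg expects a bool; False suppresses the regression line
sns.regplot(x=x, y=y, x_bins=x_bins, fit_reg=False)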
plt.show()

Result: [Figure: regplot showing binned data with CIs]

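Not that it's complicated to calculate the means bin by bin, but the CIs are computed using random numbers. It would be nice to have exactly the same numbers available as the ones that are plotted, so how do I access them? There must be some sort of get_* method that I'm overlooking.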

Answer 1

It is definitely worth digging into the seaborn source code to see how the confidence intervals are calculated; that part isn't hard to find.
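For orientation, here is a minimal sketch of that kind of calculation using NumPy only. It is not seaborn's code. It assumes each point is assigned to its nearest bin center and that the interval is a percentile bootstrap of the bin mean. The function name `binned_mean_ci` and its parameters are made up for illustration.

```python
import numpy as np


def binned_mean_ci(x, y, bin_centers, n_boot=1000, ci=95, seed=None):
    """Per-bin mean and percentile-bootstrap CI (illustrative sketch only)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    bin_centers = np.asarray(bin_centers, dtype=float)

    # Assumption: every point belongs to the bin whose center is closest.
    nearest = np.argmin(np.abs(x[:, None] - bin_centers[None, :]), axis=1)

    tail = (100 - ci) / 2
    rows = []
    for i, center in enumerate(bin_centers):
        y_bin = y[nearest == i]
        if y_bin.size == 0:
            continue  # nothing to estimate for an empty bin
        # Resample the bin with replacement and take the mean of each resample.
        idx = rng.integers(0, y_bin.size, size=(n_boot, y_bin.size))
        boot_means = y_bin[idx].mean(axis=1)
        lo, hi = np.percentile(boot_means, [tail, 100 - tail])
        rows.append((center, y_bin.mean(), lo, hi))

    # Columns: bin center, mean, lower CI bound, upper CI bound
    return np.array(rows)
```

Because the resampling is random, the bounds from a sketch like this will be close to the plotted ones but not identical to them, which is exactly why reading the values back from the figure is attractive.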

Either way, the confidence intervals and means can be read directly from the figure.

Confidence intervals

We can begin by getting the line handles from the current axes:

lines = plt.gca().lines

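Excellent. Each of these lines is the vertical CI bar of one bin, so all that's left is to iterate over them and take the minimum and maximum of each one (the lower and upper bounds, respectively):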

lower = [line.get_ydata().min() for line in lines]
upper = [line.get_ydata().max() for line in lines]

We can check that this works by plotting the lower and upper CI bounds as red crosses:

plt.scatter(x_bins, lower, marker='x', color='C3', zorder=3)
plt.scatter(x_bins, upper, marker='x', color='C3', zorder=3)

Means

The means are the y-coordinates of the scatter points, so we can extract them as:

means = ax.collections[0].get_offsets()[:, 1]

Let's add these means to our plot as well:

plt.scatter(x_bins, means, color='C1', marker='x', zorder=3)

Together, these steps produce a plot showing that we have successfully extracted the correct data.
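If you need this more than once, the two steps can be wrapped in a small helper. This is only a sketch: `extract_binned_estimates` is a made-up name, and it relies on the artist layout used above (one vertical two-point line per bin for the CI, and the first collection on the axes holding the means), which is an implementation detail that may change between seaborn versions.

```python
import numpy as np


def extract_binned_estimates(ax):
    """Read bin centers, means and CI bounds back from a binned regplot Axes.

    Assumes one vertical two-point line per bin (the CI bar) and that
    ax.collections[0] is the scatter of bin means.
    """
    lower, upper = [], []
    for line in ax.lines:
        xd = np.asarray(line.get_xdata(), dtype=float)
        yd = np.asarray(line.get_ydata(), dtype=float)
        # Keep only vertical two-point segments; this skips a fitted
        # regression line if one was drawn on the same axes.
        if xd.size == 2 and xd[0] == xd[1]:
            lower.append(yd.min())
            upper.append(yd.max())

    # The scatter offsets are (x, y) pairs: bin center and bin mean.
    offsets = np.asarray(ax.collections[0].get_offsets(), dtype=float)
    centers, means = offsets[:, 0], offsets[:, 1]
    return centers, means, np.array(lower), np.array(upper)
```

After the `sns.regplot` call from the question, `centers, means, lower, upper = extract_binned_estimates(ax)` should give four arrays of equal length. Taking the bin centers from the scatter points rather than from `x_bins` keeps everything aligned even if a bin turned out to be empty and was not drawn.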
