Question

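I want to draw lines between the data points and the estimated model (the residuals; cyan lines). Currently I do this by iterating over all data points in my income pandas.DataFrame and adding a vertical line for each one. x, y are the coordinates of the point and predicted is the prediction (here the blue line).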

plt.scatter(income["Education"], income["Income"], c='red')
plt.ylim(0,100)

for indx, (x, y, _, _, predicted) in income.iterrows():
    plt.axvline(x, y/100, predicted/100) # /100 because it needs floats [0,1]

Is there a more efficient way? This doesn't seem like a good approach for many rows.

Answer 1

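First, note that axvline works here only by coincidence. The y values that axvline takes (ymin and ymax) are in axes-relative coordinates, where 0 is the bottom of the axes and 1 is the top, not in data coordinates. Dividing by 100 gives the right result only because the y-limits happen to be set to (0, 100).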
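
To make the difference concrete, here is a small sketch (the limits and values are chosen purely for illustration, and the names to_data, y_low and y_high are mine). With y-limits of (0, 200) on a linear scale, ymin=0.25 and ymax=0.75 end up covering data values 50 to 150, not 0.25 to 0.75:

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.set_xlim(0, 2)
ax.set_ylim(0, 200)

# ymin/ymax are fractions of the axes height (0 = bottom, 1 = top)
line = ax.axvline(1.0, ymin=0.25, ymax=0.75)

# Convert those axes fractions into data coordinates for comparison
to_data = ax.transAxes + ax.transData.inverted()
y_low = to_data.transform((0.5, 0.25))[1]
y_high = to_data.transform((0.5, 0.75))[1]
print(y_low, y_high)
```

If the y-limits change later (for example through autoscaling), the same fractions map to different data values, which is why the /100 trick is fragile.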

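vlines, in contrast, uses data coordinates and also has the advantage of accepting arrays of values. It then creates a single LineCollection, which is more efficient than drawing individual lines.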

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-1.2,1.2,20)
y = np.sin(x)
dy = (np.random.rand(20)-0.5)*0.5

fig, ax = plt.subplots()
ax.plot(x,y)
ax.scatter(x,y+dy)

ax.vlines(x,y,y+dy)

plt.show()
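
Applied to the DataFrame from the question, this could look roughly like the sketch below. The data here is made up, and the column name "Predicted" is an assumption, since the question does not show what the prediction column is called; only "Education" and "Income" come from the question.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's `income` DataFrame
rng = np.random.default_rng(0)
education = np.linspace(10, 22, 30)
predicted = 5.0 * education - 35.0  # assumed linear model, illustration only
income = pd.DataFrame({
    "Education": education,
    "Income": predicted + rng.normal(0, 6, size=30),
    "Predicted": predicted,  # assumed column name
})

fig, ax = plt.subplots()
ax.scatter(income["Education"], income["Income"], c="red")
ax.plot(income["Education"], income["Predicted"], c="blue")

# One call draws every residual line, in data coordinates, no loop needed
residual_lines = ax.vlines(income["Education"], income["Predicted"],
                           income["Income"], colors="c")
ax.set_ylim(0, 100)

plt.show()
```

Since vlines takes the columns directly, neither iterrows nor the division by 100 is needed.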

Python: plotting residuals on a fitted model
