Sum, min, max, and mean of R values about a regression line in Python

Date: 2019-07-14 06:51:55

Tags: python statistics linear-regression

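I have a time-series DataFrame that plots as follows (two DataFrames are plotted together in the figure below).

The plotted series are the orange and blue lines, and each dataset's regression (trend) line is shown in the corresponding orange or blue.

How can I compute the sum, minimum, maximum, and mean of the R values (the distances from the regression line) separately for the points above and below the regression line? That is, in Python, the sum, min, max, and mean of the positive R values and of the negative R values, each computed separately. There may be a term for what I am trying to do, but I am new to statistics and don't know it. Can someone point me in the right direction?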

[Figure: line plot of the two time series (orange and blue) with their regression lines]

Update: the data I have looks like this (the real data is much longer). The overall trend is downward, with small rises in between.

Time	Values
101	20.402
102	20.302
103	20.202
104	20.102
105	20.002
106	19.902
107	19.802
108	19.702
109	19.602
110	19.502
111	19.402
112	19.302
113	19.202
114	20.337
115	20.437
116	20.537
117	18.802
118	18.702
119	18.602
120	18.502
121	18.402
122	18.302
123	18.202
124	18.102
125	18.002
126	17.902
127	17.802
128	17.702
129	17.602
130	17.502
131	18.502
132	18.402
133	18.302
134	17.702
135	17.602
136	17.502
137	17.402
138	17.302
139	17.202
140	17.102
141	17.002

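1 Answer:

Answer 0 (score: 0)

This example uses the data you posted and computes the statistics separately for the positive and the negative errors, excluding errors of exactly zero.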

import numpy


xData = numpy.array([101.0, 102.0, 103.0, 104.0, 105.0, 106.0, 107.0, 108.0, 109.0, 110.0, 111.0, 112.0, 113.0, 114.0, 115.0, 116.0, 117.0, 118.0, 119.0, 120.0, 121.0, 122.0, 123.0, 124.0, 125.0, 126.0, 127.0, 128.0, 129.0, 130.0, 131.0, 132.0, 133.0, 134.0, 135.0, 136.0, 137.0, 138.0, 139.0, 140.0, 141.0])
yData = numpy.array([20.402, 20.302, 20.202, 20.102, 20.002, 19.902, 19.802, 19.702, 19.602, 19.502, 19.402, 19.302, 19.202, 20.337, 20.437, 20.537, 18.802, 18.702, 18.602, 18.502, 18.402, 18.302, 18.202, 18.102, 18.002, 17.902, 17.802, 17.702, 17.602, 17.502, 18.502, 18.402, 18.302, 17.702, 17.602, 17.502, 17.402, 17.302, 17.202, 17.102, 17.002])


polynomialOrder = 1 # example straight line

# curve fit the test data
fittedParameters = numpy.polyfit(xData, yData, polynomialOrder)
print('Fitted Parameters:', fittedParameters)

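modelPredictions = numpy.polyval(fittedParameters, xData)
# error = predicted - observed, so a positive error means the data point
# lies BELOW the fitted line and a negative error means it lies ABOVE it
fitErrors = modelPredictions - yData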

# boolean masks split the errors by sign;
# this logic excludes errors of exactly zero
positiveErrors = fitErrors[fitErrors > 0.0]
negativeErrors = fitErrors[fitErrors < 0.0]

print('Positive error statistics:')
print('    sum =',  numpy.sum(positiveErrors))
print('    min =',  numpy.min(positiveErrors))
print('    max =',  numpy.max(positiveErrors))
print('    mean =', numpy.mean(positiveErrors))

print()

print('Negative error statistics:')
print('    sum =',  numpy.sum(negativeErrors))
print('    min =',  numpy.min(negativeErrors))
print('    max =',  numpy.max(negativeErrors))
print('    mean =', numpy.mean(negativeErrors))
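
The quantity asked about is usually called a residual: the observed value minus the value predicted by the fitted line. The following is a minimal sketch of the same calculation wrapped in a reusable function, not a definitive implementation. It assumes the observed-minus-predicted sign convention (so points above the line have positive residuals, the opposite sign of `fitErrors` above). The function name `residual_stats`, the `above`/`below` keys, and the small demo series are made up for illustration. It also returns `None` for a group with no points, since `numpy.min` and `numpy.max` raise an error on empty input.

```python
import numpy as np


def residual_stats(x, y, order=1):
    """Fit a polynomial and summarize residuals above and below the fit.

    Residuals are observed minus predicted, so points above the
    fitted line have positive residuals.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    coefficients = np.polyfit(x, y, order)
    residuals = y - np.polyval(coefficients, x)

    def summarize(values):
        # An empty group has no min, max, or mean, so report None.
        if values.size == 0:
            return None
        return {
            "count": int(values.size),
            "sum": float(values.sum()),
            "min": float(values.min()),
            "max": float(values.max()),
            "mean": float(values.mean()),
        }

    # Residuals of exactly zero fall into neither group.
    return {
        "above": summarize(residuals[residuals > 0.0]),
        "below": summarize(residuals[residuals < 0.0]),
    }


# Small made-up series, for illustration only
x_demo = [0.0, 1.0, 2.0, 3.0]
y_demo = [0.0, 2.0, 1.0, 3.0]
stats = residual_stats(x_demo, y_demo)
print("above the line:", stats["above"])
print("below the line:", stats["below"])
```

To compare two datasets, call the function once per dataset, for example `residual_stats(xData, yData)` with the arrays defined above. For a least-squares fit that includes an intercept, the residuals sum to zero, so the "above" and "below" sums should cancel up to rounding; the min, max, and mean of each group are the more informative figures.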