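I have a time-series DataFrame, plotted below (two DataFrames are plotted together in the figure).
The line plots are the thin orange and blue lines. The regression (trend) line for each dataset is similar to the thick orange and blue lines, respectively.
How can I calculate the sum, minimum, maximum, and mean of the R values (the distances from the regression line) separately for the points above and below the regression line? That is, in Python, the sum, min, max, and mean of the positive R values and of the negative R values. There may be a term for what I am trying to do, but I am new to statistics and do not know it. Can someone point me in the right direction?
Update: The data I have looks like the following (the actual data is much longer). The overall trend is downward, with small upward jumps in between.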
Time Values
101 20.402
102 20.302
103 20.202
104 20.102
105 20.002
106 19.902
107 19.802
108 19.702
109 19.602
110 19.502
111 19.402
112 19.302
113 19.202
114 20.337
115 20.437
116 20.537
117 18.802
118 18.702
119 18.602
120 18.502
121 18.402
122 18.302
123 18.202
124 18.102
125 18.002
126 17.902
127 17.802
128 17.702
129 17.602
130 17.502
131 18.502
132 18.402
133 18.302
134 17.702
135 17.602
136 17.502
137 17.402
138 17.302
139 17.202
140 17.102
141 17.002
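Answer 0 (score: 0)
This example uses the data you posted and computes the statistics for the positive and negative errors separately, excluding errors of exactly zero.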
import numpy
xData = numpy.array([101.0, 102.0, 103.0, 104.0, 105.0, 106.0, 107.0, 108.0, 109.0, 110.0, 111.0, 112.0, 113.0, 114.0, 115.0, 116.0, 117.0, 118.0, 119.0, 120.0, 121.0, 122.0, 123.0, 124.0, 125.0, 126.0, 127.0, 128.0, 129.0, 130.0, 131.0, 132.0, 133.0, 134.0, 135.0, 136.0, 137.0, 138.0, 139.0, 140.0, 141.0])
yData = numpy.array([20.402, 20.302, 20.202, 20.102, 20.002, 19.902, 19.802, 19.702, 19.602, 19.502, 19.402, 19.302, 19.202, 20.337, 20.437, 20.537, 18.802, 18.702, 18.602, 18.502, 18.402, 18.302, 18.202, 18.102, 18.002, 17.902, 17.802, 17.702, 17.602, 17.502, 18.502, 18.402, 18.302, 17.702, 17.602, 17.502, 17.402, 17.302, 17.202, 17.102, 17.002])
polynomialOrder = 1 # example straight line
# curve fit the test data
fittedParameters = numpy.polyfit(xData, yData, polynomialOrder)
print('Fitted Parameters:', fittedParameters)
modelPredictions = numpy.polyval(fittedParameters, xData)
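# error = prediction - observation, so a positive error means the data point lies below the fitted line
fitErrors = modelPredictions - yData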
positiveErrors = []
negativeErrors = []
# this logic excludes errors of exactly zero
for error in fitErrors:
    if error < 0.0:
        negativeErrors.append(error)
    if error > 0.0:
        positiveErrors.append(error)
print('Positive error statistics:')
print(' sum =', numpy.sum(positiveErrors))
print(' min =', numpy.min(positiveErrors))
print(' max =', numpy.max(positiveErrors))
print(' mean =', numpy.mean(positiveErrors))
print()
print('Negative error statistics:')
print(' sum =', numpy.sum(negativeErrors))
print(' min =', numpy.min(negativeErrors))
print(' max =', numpy.max(negativeErrors))
print(' mean =', numpy.mean(negativeErrors))
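
The quantities the question calls "R values" are usually called residuals. As a rough sketch rather than a definitive implementation, the loop above could also be written with NumPy boolean masks. The function name `residual_stats` and the dictionary layout are hypothetical, chosen only for illustration. This sketch defines the residual as observed minus fitted, which is the opposite sign from `fitErrors` above, so "above" here means the data point lies above the fitted line. The demo uses eight rows (Time 111 to 118) from the posted data.

```python
import numpy as np

def residual_stats(x, y, order=1):
    """Fit a polynomial of the given order and summarize the residuals
    above and below the fitted line (residuals of exactly zero are excluded)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    coeffs = np.polyfit(x, y, order)
    # residual = observed - fitted, so positive means above the line
    residuals = y - np.polyval(coeffs, x)
    stats = {}
    for label, mask in (("above", residuals > 0.0), ("below", residuals < 0.0)):
        r = residuals[mask]
        if r.size == 0:
            # no points on this side of the line
            stats[label] = None
            continue
        stats[label] = {
            "count": int(r.size),
            "sum": float(r.sum()),
            "min": float(r.min()),
            "max": float(r.max()),
            "mean": float(r.mean()),
        }
    return stats

# rows with Time 111..118 from the posted data
x_demo = np.arange(111, 119)
y_demo = [19.402, 19.302, 19.202, 20.337, 20.437, 20.537, 18.802, 18.702]

for side, values in residual_stats(x_demo, y_demo).items():
    print(side, values)
```

For a least-squares fit that includes an intercept term, the sums above and below the line cancel to within floating-point error, so the counts, extremes, and means are usually the more informative numbers.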