How do I fit data containing negative values with linear regression?

Time: 2019-02-10 18:04:30

Tags: python linear-regression

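I want to fit my data and extract its slope, so I am using linear regression. My data is a set of clock offset values, some of which are negative. Here is my code: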

from scipy import stats
import scipy
import matplotlib.pyplot as plt
plt.style.use('ggplot')
x= [1549808191, 1549808192, 1549808196, 1549808201, 1549808202, 1549808206, 1549808207, 1549808214, 1549808215, 1549808221, 1549808226, 1549808267, 1549808272, 1549808290, 1549808304, 1549808315, 1549808324, 1549808332, 1549808355, 1549808395, 1549808396]
y= ['7', '0', '0', '0', '-2', '4', '-3', '2', '0', '-1', '0', '-2', '-1', '-1','2', '-2', '1', '0', '0', '-1', '-2']
print(x)
print(y)
plt.plot(x,y,'o-')
plt.show()
slope, intercept, r_value, p_value, std_err = scipy.stats.linregress(x, y)
print(slope)

However, it gives me this error:

    ret = umr_sum(arr, axis, dtype, out, keepdims)
TypeError: cannot perform reduce with flexible type

So how do I fix this error? And is linear regression the best way to extract fit parameters from this kind of data?

4 answers:

Answer 0 (score: 2)

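The problem appears to be in scipy.stats.linregress(x, y): your y values are strings, and you are trying to run the fit on them. Convert them to integers with map and everything works as expected: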

import scipy.stats
import matplotlib.pyplot as plt
plt.style.use('ggplot')
x= [1549808191, 1549808192, 1549808196, 1549808201, 1549808202, 1549808206, 1549808207, 1549808214, 1549808215, 1549808221, 1549808226, 1549808267, 1549808272, 1549808290, 1549808304, 1549808315, 1549808324, 1549808332, 1549808355, 1549808395, 1549808396]
y= ['7', '0', '0', '0', '-2', '4', '-3', '2', '0', '-1', '0', '-2', '-1', '-1','2', '-2', '1', '0', '0', '-1', '-2']

y = list(map(int, y))  # convert the string offsets to integers before plotting and fitting
plt.plot(x, y, 'o-')
plt.show()
slope, intercept, r_value, p_value, std_err = scipy.stats.linregress(x, y)
print("The slope is %s" % slope)

# The slope is -0.009607415773244879
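
As a follow-up sketch (not part of the original answer), the conversion can also be combined with a shift of the x axis. This assumes x holds Unix timestamps in seconds, which the question does not state. Subtracting the first timestamp leaves the slope unchanged but turns the intercept into the fitted offset at the first sample. The names y_num and t are illustrative.

```python
from scipy import stats

x = [1549808191, 1549808192, 1549808196, 1549808201, 1549808202, 1549808206,
     1549808207, 1549808214, 1549808215, 1549808221, 1549808226, 1549808267,
     1549808272, 1549808290, 1549808304, 1549808315, 1549808324, 1549808332,
     1549808355, 1549808395, 1549808396]
y = ['7', '0', '0', '0', '-2', '4', '-3', '2', '0', '-1', '0', '-2', '-1',
     '-1', '2', '-2', '1', '0', '0', '-1', '-2']

# Convert the string offsets to integers once, before any fitting.
y_num = [int(v) for v in y]

# Assumption: x is in seconds since the Unix epoch; shift so the first sample is t = 0.
t = [xi - x[0] for xi in x]

raw = stats.linregress(x, y_num)      # fit against the raw timestamps
shifted = stats.linregress(t, y_num)  # fit against elapsed seconds

print("slope (raw timestamps):", raw.slope)
print("slope (elapsed seconds):", shifted.slope)
print("intercept (fitted offset at t = 0):", shifted.intercept)
```

Both fits should report the same slope, since shifting x by a constant only moves the intercept.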


Answer 1 (score: 1)

Problem and solution


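As the other answers say, the problem is that the y values are strings. It partly worked for you because matplotlib accepts a list of strings and plots it (treating the strings as category labels rather than as true numbers), whereas scipy does not. So you need to convert the list to numbers. See below: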

from scipy import stats
import scipy
import matplotlib.pyplot as plt
plt.style.use('ggplot')
x= [1549808191, 1549808192, 1549808196, 1549808201, 1549808202, 1549808206, 1549808207, 1549808214, 1549808215, 1549808221, 1549808226, 1549808267, 1549808272, 1549808290, 1549808304, 1549808315, 1549808324, 1549808332, 1549808355, 1549808395, 1549808396]
y= ['7', '0', '0', '0', '-2', '4', '-3', '2', '0', '-1', '0', '-2', '-1', '-1','2', '-2', '1', '0', '0', '-1', '-2']
y = [float(i) for i in y]
print(x)
print(y)
plt.plot(x,y,'o-')
plt.show()
slope, intercept, r_value, p_value, std_err = scipy.stats.linregress(x, y)
print(slope)
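
To see why SciPy rejects the string list, it can help to look at what NumPy makes of it. The snippet below is only an illustration, assuming NumPy is installed (SciPy depends on it). It uses a handful of the y values, and y_str, as_text and y_num are made-up names.

```python
import numpy as np

y_str = ['7', '0', '-2', '4', '-3']

# A list of strings becomes a Unicode string array, not a numeric one,
# so sums and means are not defined on it.
as_text = np.asarray(y_str)
print(as_text.dtype.kind)   # 'U' stands for a Unicode string dtype

# Requesting a float dtype parses each string into a number.
y_num = np.asarray(y_str, dtype=float)
print(y_num.dtype.kind)     # 'f' stands for a floating-point dtype
print(y_num.mean())
```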

Answer 2 (score: 0)

It looks like your y is a list of strings. You need to convert y to integers or floats before you can run the regression.
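
The need for numeric types is visible in the least-squares slope formula itself, slope = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)², which requires sums, means and products of the y values. Below is a hand-rolled sketch in plain Python. The function name least_squares_slope is made up for illustration, and in practice scipy.stats.linregress does this work for you.

```python
def least_squares_slope(xs, ys):
    """Return the ordinary least-squares slope of ys against xs."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of cross-deviations and sum of squared x deviations.
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(xs, ys))
    sxx = sum((xi - x_mean) ** 2 for xi in xs)
    return sxy / sxx


x = [1549808191, 1549808192, 1549808196, 1549808201, 1549808202, 1549808206,
     1549808207, 1549808214, 1549808215, 1549808221, 1549808226, 1549808267,
     1549808272, 1549808290, 1549808304, 1549808315, 1549808324, 1549808332,
     1549808355, 1549808395, 1549808396]
y = ['7', '0', '0', '0', '-2', '4', '-3', '2', '0', '-1', '0', '-2', '-1',
     '-1', '2', '-2', '1', '0', '0', '-1', '-2']

# Convert the strings to floats; after that the arithmetic is well defined.
y_num = [float(v) for v in y]
slope = least_squares_slope(x, y_num)
print(slope)

# With the original strings, the arithmetic fails before any fitting happens.
try:
    least_squares_slope(x, y)
except TypeError as exc:
    print("TypeError:", exc)
```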

Answer 3 (score: 0)

Change y to a list of numbers: y = [7, 0, 0, 0, -2, ...]

It works this way:

from scipy import stats
import scipy
import matplotlib.pyplot as plt
plt.style.use('ggplot')
x= [1549808191, 1549808192, 1549808196, 1549808201, 1549808202, 1549808206, 1549808207, 1549808214, 1549808215, 1549808221, 1549808226, 1549808267, 1549808272, 1549808290, 1549808304, 1549808315, 1549808324, 1549808332, 1549808355, 1549808395, 1549808396]
y= [7, 0, 0, 0, -2, 4, -3, 2, 0, -1, 0, -2, -1, -1, 2, -2, 1, 0, 0, -1, -2]
print(y)
plt.plot(x,y,'o-')
plt.show()
slope, intercept, r_value, p_value, std_err = scipy.stats.linregress(x, y)
print(slope)

This returns a slope of about -0.0096.