How do I force a zero intercept when fitting a second-order function? (Python)

Time: 2017-01-20 19:49:37

Tags: python statsmodels least-squares best-fit-curve

I am trying to fit a second-order function with a zero intercept. Right now, when I plot it, I get a line with a y-intercept > 0. I am trying to fit a set of points generated by the function:

y**2 = 14.29566 * np.pi * x

y = np.sqrt(14.29566 * np.pi * x)

to the two data sets x and y, with D = 3.57391553. My fitting routine is:

z = np.polyfit(x,y,2)                         # Generate curve coefficients
p = np.poly1d(z)                              # Generate curve function

xp = np.linspace(0, catalog.tlimit, 10)       # generate input values

plt.scatter(x,y)
plt.plot(xp, p(xp), '-')
plt.show()

I have also tried using statsmodels.ols:

import statsmodels.api as sm

mod_ols = sm.OLS(y, x)
res_ols = mod_ols.fit()

but I do not understand how to generate coefficients for a second-order function rather than a linear one, nor how to set the y-intercept to 0. I saw another, similar post dealing with forcing the y-intercept to 0 for a linear fit, but I could not figure out how to do it with a second-order function.
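
My best guess, generalizing that linear trick, is a plain least-squares solve over a basis with no constant column, along the lines of the sketch below, but I am not sure this is the right approach:

import numpy as np

# guess: fit y = a*x + b*x**2 (no constant term, so the curve is
# forced through the origin) with a direct least-squares solve
x_arr = np.asarray(x)
A = np.column_stack([x_arr, x_arr**2])   # design matrix without a column of 1s
(a, b), *_ = np.linalg.lstsq(A, np.asarray(y), rcond=None)
p0 = np.poly1d([b, a, 0.0])              # coefficients in descending powers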

Current plot: [image: I want to anchor the curve at (0,0)]

Data:

x = [0., 0.00325492, 0.00650985, 0.00976477, 0.01301969, 0.01627462, 0.01952954, 0.02278447, 
     0.02603939, 0.02929431, 0.03254924, 0.03580416, 0.03905908, 0.04231401]
y = [0., 0.38233801, 0.5407076, 0.66222886, 0.76467602, 0.85493378, 0.93653303, 1.01157129,
     1.0814152, 1.14701403, 1.20905895, 1.26807172, 1.32445772, 1.3785393]

2 Answers:

Answer 0 (score: 2)

If I understand you correctly, you want to fit a polynomial regression line with OLS. You can try the following (as we can see, the higher the degree of the fitted polynomial, the more closely the model fits the data):

xp = np.linspace(0, 0.05, 10)       # generate input values
pred1 = p(xp)                       # prediction with np.poly1d as you have done

import pandas as pd
import statsmodels.formula.api as sm   # formula interface (lowercase sm.ols)

data = pd.DataFrame(data={'x': x, 'y': y})

# let's first fit a 2nd-order polynomial with OLS, with an intercept
olsres2 = sm.ols(formula = 'y ~ x + I(x**2)', data = data).fit()
print(olsres2.summary())


                             OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       0.978
Model:                            OLS   Adj. R-squared:                  0.974
Method:                 Least Squares   F-statistic:                     243.1
Date:                Sat, 21 Jan 2017   Prob (F-statistic):           7.89e-10
Time:                        04:16:22   Log-Likelihood:                 20.323
No. Observations:                  14   AIC:                            -34.65
Df Residuals:                      11   BIC:                            -32.73
Df Model:                           2
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
Intercept      0.1470      0.045      3.287      0.007         0.049     0.245
x             52.4655      4.907     10.691      0.000        41.664    63.267
I(x ** 2)   -580.4730    111.820     -5.191      0.000      -826.588  -334.358
==============================================================================
Omnibus:                        4.803   Durbin-Watson:                   1.164
Prob(Omnibus):                  0.091   Jarque-Bera (JB):                2.101
Skew:                          -0.854   Prob(JB):                        0.350
Kurtosis:                       3.826   Cond. No.                     6.55e+03
==============================================================================

pred2 = olsres2.predict(data) # predict with the fitted model
# fit 3rd order polynomial with OLS with intercept
olsres3 = sm.ols(formula = 'y ~ x + I(x**2) + I(x**3)', data = data).fit()
pred3 = olsres3.predict(data) # predict

plt.scatter(x,y)
plt.plot(xp, pred1, '-r', label='np.poly')
plt.plot(x, pred2, '-b', label='ols.o2')
plt.plot(x, pred3, '-g', label='ols.o3')
plt.legend(loc='upper left')
plt.show()

[plot: the data with the np.poly fit and the 2nd- and 3rd-order OLS fits (with intercept)]
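
A side note: pred2 and pred3 above are evaluated only at the 14 observed x values; to draw a smoother curve, predict on a dense grid instead, e.g.:

# evaluate the fitted formula model on a dense grid rather than at the data points
grid = pd.DataFrame({'x': np.linspace(0, 0.045, 100)})
plt.plot(grid['x'], olsres2.predict(grid), '-b')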

# now let's fit the polynomial regression lines, this time without an intercept
olsres2 = sm.ols(formula = 'y ~ x + I(x**2) - 1', data = data).fit()
print(olsres2.summary())
                                OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       0.993
Model:                            OLS   Adj. R-squared:                  0.992
Method:                 Least Squares   F-statistic:                     889.6
Date:                Sat, 21 Jan 2017   Prob (F-statistic):           9.04e-14
Time:                        04:16:24   Log-Likelihood:                 15.532
No. Observations:                  14   AIC:                            -27.06
Df Residuals:                      12   BIC:                            -25.79
Df Model:                           2
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
x             65.8170      3.714     17.723      0.000        57.726    73.908
I(x ** 2)   -833.6787    109.279     -7.629      0.000     -1071.777  -595.580
==============================================================================
Omnibus:                        1.716   Durbin-Watson:                   0.537
Prob(Omnibus):                  0.424   Jarque-Bera (JB):                1.341
Skew:                           0.649   Prob(JB):                        0.511
Kurtosis:                       2.217   Cond. No.                         118.
==============================================================================
pred2 = olsres2.predict(data)
# fit 3rd order polynomial with OLS without intercept
olsres3 = sm.ols(formula = 'y ~ x + I(x**2) + I(x**3) -1', data = data).fit()
pred3 = olsres3.predict(data)

plt.scatter(x,y)
plt.plot(xp, pred1, '-r', label='np.poly')
plt.plot(x, pred2, '-b', label='ols.o2')
plt.plot(x, pred3, '-g', label='ols.o3')
plt.legend(loc='upper left')
plt.show()
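
As a quick sanity check, predicting at x = 0 confirms that the zero-intercept fits really are anchored at the origin:

origin = pd.DataFrame({'x': [0.0]})
print(olsres2.predict(origin))   # -> [ 0.]
print(olsres3.predict(origin))   # -> [ 0.]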

[plot: the data with the np.poly fit and the zero-intercept 2nd- and 3rd-order OLS fits]

Answer 1 (score: 1)

This code gives a direct answer to your question. I must admit it was my first try with this formula language (see https://patsy.readthedocs.io/en/latest/).

The heart of the model is np.power(y,2) ~ x - 1. It means: regress the square of y on x, omitting the intercept. Once the coefficient of x has been estimated, divide it by pi to get the number you are after.

x = [0., 0.00325492, 0.00650985, 0.00976477, 0.01301969, 0.01627462, 0.01952954, 0.02278447, 0.02603939, 0.02929431, 0.03254924, 0.03580416, 0.03905908, 0.04231401]
y = [0., 0.38233801, 0.5407076, 0.66222886, 0.76467602, 0.85493378, 0.93653303, 1.01157129,  1.0814152, 1.14701403, 1.20905895, 1.26807172, 1.32445772, 1.3785393]

data = { 'x': x, 'y': y }

import statsmodels.formula.api as smf
import numpy as np

# regress y**2 on x, suppressing the intercept with '- 1'
model = smf.ols(formula = 'np.power(y,2) ~ x - 1', data = data)
results = model.fit()
print(results.params)

predicteds = results.predict(data)

import matplotlib.pyplot as plt

plt.scatter(x, y)
plt.plot(x, [_**2 for _ in y], 'go')   # the squared y values that were regressed
plt.plot(x, predicteds, '-b')          # the fitted zero-intercept line for y**2
plt.show()

The regression coefficient comes out as 44.911147. Here is the graphical result:

[plot]
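
Dividing the fitted coefficient by pi recovers the constant of the original function (44.911147 / pi ≈ 14.29566, the factor in y**2 = 14.29566 * np.pi * x):

# recover the original constant from the fitted slope
print(results.params['x'] / np.pi)   # ~14.29566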

OK, I was sceptical. ;)