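I am new to Python and am trying to fit a gamma regression. I would like to obtain estimates similar to R's, but I don't understand the Python syntax and my code produces an error. Any ideas on how to fix it?
My R code: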
set.seed(1)
y = rgamma(18,10,.1)
print(y)
[1] 76.67251 140.40808 138.26660 108.20993 53.46417 110.61754 119.11950 113.57558 85.82045 71.96892
[11] 76.81693 86.00139 93.62010 69.49795 121.99775 114.18707 125.43608 120.63640
# Option 1
model = glm(y~1,family=Gamma)
summary(model)
# Option 2
# x = rep(1,18)
# summary(glm(y~x,family=Gamma))
Output:
summary(model)
Call:
glm(formula = y ~ 1, family = Gamma)
Deviance Residuals:
Min 1Q Median 3Q Max
-0.57898 -0.24017 0.07637 0.17489 0.34345
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.009856 0.000581 16.96 4.33e-12 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for Gamma family taken to be 0.06255708)
Null deviance: 1.1761 on 17 degrees of freedom
Residual deviance: 1.1761 on 17 degrees of freedom
AIC: 171.3
Number of Fisher Scoring iterations: 4
Python code:
y = [76.67251,140.40808,138.26660,108.20993,53.46417,110.61754,
119.11950,113.57558,85.82045,71.96892,76.81693,86.00139,
93.62010,69.49795,121.99775,114.18707,125.43608,120.63640]
import numpy as np
x = np.repeat(1,18)
import statsmodels.api as sm
model = sm.GLM(x,y, family=sm.families.Gamma()).fit()
print(model.summary())
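I expect output similar to R's.
Answer 0 (score: 3)
You need to swap the order of the x and y variables in your Python code: sm.GLM expects the response (endog) first and the design matrix (exog) second. You will then see exactly the same results, although the number of significant digits shown differs from R's output: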
sm.GLM(y,x, family=sm.families.Gamma()).fit().summary()
<class 'statsmodels.iolib.summary.Summary'>
"""
Generalized Linear Model Regression Results
==============================================================================
Dep. Variable: y No. Observations: 18
Model: GLM Df Residuals: 17
Model Family: Gamma Df Model: 0
Link Function: inverse_power Scale: 0.0625558699706
Method: IRLS Log-Likelihood: -83.656
Date: Sun, 20 May 2018 Deviance: 1.1761
Time: 17:59:04 Pearson chi2: 1.06
No. Iterations: 4
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
const 0.0099 0.001 16.963 0.000 0.009 0.011
==============================================================================
"""
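Different Python packages each have their own syntax. Here is a good link with some examples of how to use formula syntax in Python: http://www.statsmodels.org/dev/example_formulas.html
Answer 1 (score: 1)
Here is another way, using a formula. For this you need to import statsmodels.formula.api: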
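As a sanity check on the numbers in both summaries, the intercept-only case can be reproduced by hand. This is a minimal sketch, not what R or statsmodels run internally. It assumes the inverse link shown in the outputs above, for which the fitted mean of an intercept-only model is the sample mean, and it assumes the dispersion is the Pearson chi-square divided by the residual degrees of freedom. The variable names are illustrative only.

```python
import math

y = [76.67251, 140.40808, 138.26660, 108.20993, 53.46417, 110.61754,
     119.11950, 113.57558, 85.82045, 71.96892, 76.81693, 86.00139,
     93.62010, 69.49795, 121.99775, 114.18707, 125.43608, 120.63640]

n = len(y)
mu = sum(y) / n                      # fitted mean for an intercept-only model

# Inverse link: the coefficient is 1 / mean.
intercept = 1.0 / mu

# Pearson chi-square with the Gamma variance function V(mu) = mu**2.
pearson_chi2 = sum(((yi - mu) / mu) ** 2 for yi in y)
dispersion = pearson_chi2 / (n - 1)  # residual df = n - 1 (one parameter)

# Standard error of the intercept: sqrt(dispersion / n) / mu.
std_err = math.sqrt(dispersion / n) / mu

print(f"intercept  = {intercept:.6f}")
print(f"dispersion = {dispersion:.6f}")
print(f"std. error = {std_err:.6f}")
print(f"statistic  = {intercept / std_err:.3f}")
```

The printed values should line up with the Estimate, dispersion/Scale, and Std. Error entries reported by R and statsmodels, up to rounding.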
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf
y = [76.67251,140.40808,138.26660,108.20993,53.46417,110.61754,
119.11950,113.57558,85.82045,71.96892,76.81693,86.00139,
93.62010,69.49795,121.99775,114.18707,125.43608,120.63640]
df = pd.DataFrame({'y':y})
model = smf.glm(formula = 'y ~ 1', data = df, family=sm.families.Gamma()).fit()
model.summary()
<class 'statsmodels.iolib.summary.Summary'>
"""
Generalized Linear Model Regression Results
==============================================================================
Dep. Variable: y No. Observations: 18
Model: GLM Df Residuals: 17
Model Family: Gamma Df Model: 0
Link Function: inverse_power Scale: 0.062556
Method: IRLS Log-Likelihood: -83.656
Date: Sun, 20 May 2018 Deviance: 1.1761
Time: 22:00:54 Pearson chi2: 1.06
No. Iterations: 6 Covariance Type: nonrobust
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
Intercept 0.0099 0.001 16.963 0.000 0.009 0.011
==============================================================================
"""