Extracting coefficients from spreg.OLS results

Posted: 2017-07-28 14:17:03

Tags: python r pandas pysal spgwr

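I am trying to recreate an R spgwr notebook using PySAL. In R, the local coefficients can be extracted directly into a data frame, as follows (the CSV is available here):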

R:

library(spgwr)
df <- read.csv("file.csv")
attach(df)
# calibrate bandwidth
bw <- gwr.sel(endog ~ x1+x2+x3, data=df, coords=cbind(x,y), adapt=T)
# fit model
gwr.model = gwr(endog ~ x1+x2+x3, data=df, coords=cbind(x,y), adapt=bw, hatmatrix=TRUE, se.fit=TRUE)
# build results DataFrame
results<-as.data.frame(gwr.model$SDF)
# results contains coefficients and standard errors:
head(results)
# columns: sum.w X.Intercept, x1, x2, x3 … pred.se_EDF x y
# attach coefficients to original dataframe
df$coef_x1<-results$x1
df$coef_x2<-results$x2
df$coef_x3<-results$x3

Is there a simple way to compute these coefficients from the results of ps.spreg.OLS?

Python:

import pandas as pd
import pysal as ps

df = pd.read_csv("file.csv")
# repackage variables for convenience
yxs = df.loc[:, ['endog', 'x1', 'x2', 'x3']]

# build spatial weights
# this will give us different weights to R, but that's OK for now
spatial_weights = ps.knnW_from_array(
    df.loc[yxs.index, ['x', 'y']].values
)
# Row-standardise the weights
spatial_weights.transform = 'R'

fit = ps.spreg.OLS(
    df.endog.values[:, None],
    df[['x1', 'x2', 'x3']].values,
    w=spatial_weights,
    spat_diag=True,
)
print(fit.summary)

REGRESSION
----------
SUMMARY OF OUTPUT: ORDINARY LEAST SQUARES
-----------------------------------------
Data set            :     unknown
Weights matrix      :     unknown
Dependent Variable  :     dep_var                Number of Observations:         625
Mean dependent var  :    345.8406                Number of Variables   :           4
S.D. dependent var  :     19.5388                Degrees of Freedom    :         621
R-squared           :      0.5163
Adjusted R-squared  :      0.5139
Sum squared residual:  115234.766                F-statistic           :    220.9246
Sigma-square        :     185.563                Prob(F-statistic)     :    1.68e-97
S.E. of regression  :      13.622                Log likelihood        :   -2517.141
Sigma-square ML     :     184.376                Akaike info criterion :    5042.283
S.E of regression ML:     13.5785                Schwarz criterion     :    5060.034

------------------------------------------------------------------------------------
            Variable     Coefficient       Std.Error     t-Statistic     Probability
------------------------------------------------------------------------------------
            CONSTANT     361.8912240       3.0709296     117.8441935       0.0000000
               var_1     -13.0082465       1.8701682      -6.9556560       0.0000000
               var_2      -1.1944903       0.1067895     -11.1854632       0.0000000
               var_3      23.3549680       2.1474242      10.8758055       0.0000000
------------------------------------------------------------------------------------

REGRESSION DIAGNOSTICS
MULTICOLLINEARITY CONDITION NUMBER           13.948

TEST ON NORMALITY OF ERRORS
TEST                             DF        VALUE           PROB
Jarque-Bera                       2        2160.476           0.0000

DIAGNOSTICS FOR HETEROSKEDASTICITY
RANDOM COEFFICIENTS
TEST                             DF        VALUE           PROB
Breusch-Pagan test                3          75.484           0.0000
Koenker-Bassett test              3          13.942           0.0030

DIAGNOSTICS FOR SPATIAL DEPENDENCE
TEST                           MI/DF       VALUE           PROB
Lagrange Multiplier (lag)         1          43.543           0.0000
Robust LM (lag)                   1           1.819           0.1775
Lagrange Multiplier (error)       1          46.989           0.0000
Robust LM (error)                 1           5.264           0.0218
Lagrange Multiplier (SARMA)       2          48.808           0.0000

================================ END OF REPORT =====================================
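
The coefficient table in this report is a global fit: one value per variable for all 625 observations, not one per location. As a point of reference, the sketch below reproduces that kind of estimate with plain NumPy. The synthetic data and the helper name `global_ols` are assumptions for illustration and are not taken from the CSV. In PySAL, the equivalent vector should be available as `fit.betas`, a k x 1 array with the constant first.

```python
import numpy as np


def global_ols(y, X):
    """Ordinary least squares with an intercept.

    y: (n,) response, X: (n, k) regressors without a constant column.
    Returns a (k + 1,) vector; element 0 is the intercept.
    """
    X1 = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta


# Synthetic, noise-free data (hypothetical, for illustration only)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_beta = np.array([2.0, -1.0, 0.5, 3.0])
y = true_beta[0] + X @ true_beta[1:]

beta = global_ols(y, X)
print(beta)  # a single coefficient vector for the whole data set
```

Because there is only one such vector, nothing in it varies over space, which is why it cannot be attached to the rows of `df` the way the spgwr columns are.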

I am not sure how to use the data contained in fit to calculate the local coefficients.
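
One possible route, sketched here under stated assumptions, is to skip the global fit and run a separate weighted least-squares regression at each location, which is the basic idea behind geographically weighted regression. The function name `gwr_local_coefs`, the Gaussian kernel, the fixed `bandwidth`, and the synthetic data are all assumptions for illustration. This is not spgwr's exact procedure and it does not use anything stored in `fit`.

```python
import numpy as np


def gwr_local_coefs(coords, y, X, bandwidth):
    """Fit one weighted least-squares regression per location.

    coords: (n, 2) point coordinates
    y:      (n,) response
    X:      (n, k) regressors without a constant column
    Returns an (n, k + 1) array; column 0 is the local intercept.
    """
    n = len(y)
    X1 = np.column_stack([np.ones(n), X])
    betas = np.empty((n, X1.shape[1]))
    for i in range(n):
        # squared distance from location i to every observation
        d2 = np.sum((coords - coords[i]) ** 2, axis=1)
        # Gaussian kernel weights with a fixed bandwidth (assumed choice)
        w = np.exp(-0.5 * d2 / bandwidth ** 2)
        XtW = X1.T * w  # weight each observation's column
        betas[i] = np.linalg.solve(XtW @ X1, XtW @ y)
    return betas


# Synthetic data (hypothetical): the slope on x1 grows from west to east
rng = np.random.default_rng(0)
n = 300
coords = rng.uniform(0, 10, size=(n, 2))
X = rng.normal(size=(n, 3))
slope_x1 = 1.0 + 0.5 * coords[:, 0]
y = 2.0 + slope_x1 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * X[:, 2]

local = gwr_local_coefs(coords, y, X, bandwidth=2.0)
print(local.shape)  # one row of coefficients per location
```

With the data from the question, the call would take `df[['x', 'y']].values`, `df.endog.values` and `df[['x1', 'x2', 'x3']].values`, and a column could then be attached with `df['coef_x1'] = local[:, 1]`. The R call uses an adaptive bandwidth selected by `gwr.sel`, so a fixed bandwidth would not reproduce its numbers exactly. The PySAL ecosystem also has a dedicated GWR implementation in the `mgwr` package, which may be a better fit than a hand-rolled loop.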

0 answers:

No answers.