I am trying to recreate an R spgwr notebook using PySAL (the CSV is available here).
In R, the local coefficients can be extracted directly into a DataFrame, like so:
R:
library(spgwr)
df <- read.csv("file.csv")
attach(df)
# calibrate bandwidth
bw <- gwr.sel(endog ~ x1+x2+x3, data=df, coords=cbind(x,y), adapt=T)
# fit model
gwr.model = gwr(endog ~ x1+x2+x3, data=df, coords=cbind(x,y), adapt=bw, hatmatrix=TRUE, se.fit=TRUE)
# build results DataFrame
results<-as.data.frame(gwr.model$SDF)
# results contains coefficients and standard errors:
head(results)
# columns: sum.w X.Intercept, x1, x2, x3 … pred.se_EDF x y
# attach coefficients to original dataframe
df$coef_x1<-results$x1
df$coef_x2<-results$x2
df$coef_x3<-results$x3
Is there a way to do the same thing with ps.spreg.OLS?
Python:
import pandas as pd
import pysal as ps
df = pd.read_csv("file.csv")
# repackage variables for convenience
yxs = df.loc[:, ['endog', 'x1', 'x2', 'x3']]
# build spatial weights
# this will give us different weights to R, but that's OK for now
spatial_weights = ps.knnW_from_array(
    df.loc[yxs.index, ['x', 'y']].values
)
# Row-standardise the weights
spatial_weights.transform = 'R'
fit = ps.spreg.OLS(
    df.endog.values[:, None],
    df[['x1', 'x2', 'x3']].values,
    w=spatial_weights,
    spat_diag=True,
)
print(fit.summary)
REGRESSION
----------
SUMMARY OF OUTPUT: ORDINARY LEAST SQUARES
-----------------------------------------
Data set            :     unknown
Weights matrix      :     unknown
Dependent Variable  :     dep_var                Number of Observations:         625
Mean dependent var  :    345.8406                Number of Variables   :           4
S.D. dependent var  :     19.5388                Degrees of Freedom    :         621
R-squared           :      0.5163
Adjusted R-squared  :      0.5139
Sum squared residual:  115234.766                F-statistic           :    220.9246
Sigma-square        :     185.563                Prob(F-statistic)     :    1.68e-97
S.E. of regression  :      13.622                Log likelihood        :   -2517.141
Sigma-square ML     :     184.376                Akaike info criterion :    5042.283
S.E of regression ML:     13.5785                Schwarz criterion     :    5060.034
------------------------------------------------------------------------------------
            Variable     Coefficient       Std.Error     t-Statistic     Probability
------------------------------------------------------------------------------------
            CONSTANT     361.8912240       3.0709296     117.8441935       0.0000000
               var_1     -13.0082465       1.8701682      -6.9556560       0.0000000
               var_2      -1.1944903       0.1067895     -11.1854632       0.0000000
               var_3      23.3549680       2.1474242      10.8758055       0.0000000
------------------------------------------------------------------------------------
REGRESSION DIAGNOSTICS
MULTICOLLINEARITY CONDITION NUMBER           13.948
TEST ON NORMALITY OF ERRORS
TEST                             DF        VALUE           PROB
Jarque-Bera                       2        2160.476        0.0000
DIAGNOSTICS FOR HETEROSKEDASTICITY
RANDOM COEFFICIENTS
TEST                             DF        VALUE           PROB
Breusch-Pagan test                3          75.484        0.0000
Koenker-Bassett test              3          13.942        0.0030
DIAGNOSTICS FOR SPATIAL DEPENDENCE
TEST                           MI/DF       VALUE           PROB
Lagrange Multiplier (lag)         1          43.543        0.0000
Robust LM (lag)                   1           1.819        0.1775
Lagrange Multiplier (error)       1          46.989        0.0000
Robust LM (error)                 1           5.264        0.0218
Lagrange Multiplier (SARMA)       2          48.808        0.0000
================================ END OF REPORT =====================================
I'm not sure how to use the data contained in fit to calculate the local coefficients.
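For context, a local coefficient in this sense comes from fitting a separate weighted least-squares regression at each observation, with weights that decay with distance from that observation. The sketch below illustrates the idea with NumPy only; it is not spgwr's algorithm and not a PySAL API. The function name `local_coefficients`, the Gaussian kernel, and the fixed `bandwidth` are all assumptions made for illustration (the R code above uses an adaptive bandwidth), so the numbers would not be expected to match the R results.

```python
import numpy as np


def local_coefficients(coords, X, y, bandwidth):
    """Fit one distance-weighted least-squares regression per observation.

    coords    : (n, 2) array of point locations
    X         : (n, k) array of predictors, without a constant column
    y         : (n,) array holding the dependent variable
    bandwidth : fixed Gaussian kernel bandwidth in coordinate units
                (an illustrative assumption, not what spgwr calibrates)

    Returns an (n, k + 1) array: intercept first, then one column per predictor.
    """
    coords = np.asarray(coords, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a constant column so every local fit has its own intercept
    design = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    betas = np.empty((len(y), design.shape[1]))
    for i, point in enumerate(coords):
        # Gaussian kernel: observations near `point` get weights close to 1
        dist = np.linalg.norm(coords - point, axis=1)
        weights = np.exp(-0.5 * (dist / bandwidth) ** 2)
        # Weighted least squares: scale rows by sqrt(weight), then solve
        root_w = np.sqrt(weights)
        betas[i], *_ = np.linalg.lstsq(
            design * root_w[:, None], y * root_w, rcond=None
        )
    return betas
```

With the columns used in this question, the inputs would be `df[['x', 'y']].values`, `df[['x1', 'x2', 'x3']].values` and `df.endog.values`, plus a bandwidth chosen for the data. Column 0 of the result is the local intercept, so something like `df['coef_x1'] = betas[:, 1]` would mirror the last step of the R code.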