
Question

I have estimated a GLM in Python. How can I perform the Hosmer-Lemeshow goodness-of-fit test for this model in Python?

Answer 1

I found a way. The code is not of the best quality, but it works:

import pandas as pd
import numpy as np
from scipy.stats import chi2

# model is the fitted GLM; data.r is the observed 0/1 response
pihat = model.predict()
# Split the fitted probabilities at their quartiles (here I've chosen only 4 groups)
pihatcat = pd.cut(pihat, np.percentile(pihat, [0, 25, 50, 75, 100]),
                  labels=False, include_lowest=True)

meanprobs = [0] * 4   # mean fitted probability of an event, per group
expevents = [0] * 4   # expected number of events, per group
obsevents = [0] * 4   # observed number of events, per group
meanprobs2 = [0] * 4  # the same three quantities for non-events
expevents2 = [0] * 4
obsevents2 = [0] * 4

for i in range(4):
    meanprobs[i] = np.mean(pihat[pihatcat == i])
    expevents[i] = np.sum(pihatcat == i) * np.array(meanprobs[i])
    obsevents[i] = np.sum(data.r[pihatcat == i])
    meanprobs2[i] = np.mean(1 - pihat[pihatcat == i])
    expevents2[i] = np.sum(pihatcat == i) * np.array(meanprobs2[i])
    obsevents2[i] = np.sum(1 - data.r[pihatcat == i])

data1 = {'meanprobs': meanprobs, 'meanprobs2': meanprobs2}
data2 = {'expevents': expevents, 'expevents2': expevents2}
data3 = {'obsevents': obsevents, 'obsevents2': obsevents2}
m = pd.DataFrame(data1)
e = pd.DataFrame(data2)
o = pd.DataFrame(data3)

# The test statistic, which under the null hypothesis follows the chi-squared
# distribution with degrees of freedom equal to the number of groups minus 2
tt = sum(sum((np.array(o) - np.array(e)) ** 2 / np.array(e)))
pvalue = 1 - chi2.cdf(tt, 2)
pvalue
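
For reference, the quantity accumulated in tt above can be written out as follows. This is only a restatement of what the code computes; the symbols are notation introduced here, not part of the original answer: G is the number of groups (4 in the code), O_{1g} and E_{1g} are the observed and expected event counts in group g (obsevents, expevents), and O_{0g} and E_{0g} are the corresponding non-event counts (obsevents2, expevents2).

```latex
\hat{C} \;=\; \sum_{g=1}^{G}
  \left[
    \frac{\left(O_{1g} - E_{1g}\right)^{2}}{E_{1g}}
    \;+\;
    \frac{\left(O_{0g} - E_{0g}\right)^{2}}{E_{0g}}
  \right],
\qquad
\hat{C} \;\sim\; \chi^{2}_{\,G-2} \quad \text{under the null hypothesis.}
```

With G = 4 this gives the 2 degrees of freedom passed to chi2.cdf.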

Answer 2

This could be a neat function for Hosmer-Lemeshow.

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
from scipy.stats import chi2

def HosmerLemeshow(model, Y):
    pihat = model.predict()
    # Here we have chosen only 4 groups (quartiles of the fitted probabilities)
    pihatcat = pd.cut(pihat, np.percentile(pihat, [0, 25, 50, 75, 100]),
                      labels=False, include_lowest=True)

    meanprobs = [0] * 4
    expevents = [0] * 4
    obsevents = [0] * 4
    meanprobs2 = [0] * 4
    expevents2 = [0] * 4
    obsevents2 = [0] * 4

    for i in range(4):
        meanprobs[i] = np.mean(pihat[pihatcat == i])
        expevents[i] = np.sum(pihatcat == i) * np.array(meanprobs[i])
        obsevents[i] = np.sum(Y[pihatcat == i])
        meanprobs2[i] = np.mean(1 - pihat[pihatcat == i])
        expevents2[i] = np.sum(pihatcat == i) * np.array(meanprobs2[i])
        obsevents2[i] = np.sum(1 - Y[pihatcat == i])

    data1 = {'meanprobs': meanprobs, 'meanprobs2': meanprobs2}
    data2 = {'expevents': expevents, 'expevents2': expevents2}
    data3 = {'obsevents': obsevents, 'obsevents2': obsevents2}
    m = pd.DataFrame(data1)
    e = pd.DataFrame(data2)
    o = pd.DataFrame(data3)

    # The test statistic, which under the null hypothesis follows the
    # chi-squared distribution with degrees of freedom equal to the
    # number of groups minus 2. Thus 4 - 2 = 2.
    tt = sum(sum((np.array(o) - np.array(e)) ** 2 / np.array(e)))
    pvalue = 1 - chi2.cdf(tt, 2)

    # Report the statistic itself in the "Chi2" column, not chi2.cdf(tt, 2)
    return pd.DataFrame([[np.round(tt, 2), np.round(pvalue, 2)]],
                        columns=["Chi2", "p - value"])

HosmerLemeshow(glm_full, Y)
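
Both answers hard-code 4 groups and 2 degrees of freedom. A minimal sketch of the same calculation with the number of groups as a parameter might look like the following. The function name hosmer_lemeshow, its arguments, and the default of 10 groups are assumptions made for illustration, not part of either answer. It uses pd.qcut instead of explicit percentiles, and the expected count per group is the sum of the fitted probabilities, which equals the group size times the mean probability used above.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2


def hosmer_lemeshow(y, pihat, groups=10):
    """Return (statistic, degrees of freedom, p-value).

    y      : observed 0/1 outcomes (hypothetical argument name)
    pihat  : fitted probabilities, e.g. the output of model.predict()
    groups : requested number of quantile groups (assumed default: 10)
    """
    df = pd.DataFrame({"y": np.asarray(y, dtype=float),
                       "p": np.asarray(pihat, dtype=float)})
    # Quantile bins; tied cut points are merged, so fewer groups may result.
    df["bin"] = pd.qcut(df["p"], q=groups, labels=False, duplicates="drop")
    grouped = df.groupby("bin")

    n = grouped["y"].count()   # group sizes
    obs1 = grouped["y"].sum()  # observed events per group
    exp1 = grouped["p"].sum()  # expected events per group
    obs0 = n - obs1            # observed non-events
    exp0 = n - exp1            # expected non-events

    stat = ((obs1 - exp1) ** 2 / exp1 + (obs0 - exp0) ** 2 / exp0).sum()
    dof = len(n) - 2           # number of groups actually formed, minus 2
    pvalue = chi2.sf(stat, dof)
    return stat, dof, pvalue
```

With the objects from the answer above, a call would be hosmer_lemeshow(Y, glm_full.predict(), groups=4). Results can differ slightly from the pd.cut version when fitted probabilities are tied at a quartile boundary, because the two functions place bin edges differently.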

Hosmer-Lemeshow goodness of fit test in Python

2 answers:
