Why does this Logit example randomly raise "PerfectSeparationError: Perfect separation detected, results not available"?

Time: 2017-12-13 21:26:37

Tags: python statsmodels

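This Python code randomly raises "PerfectSeparationError". The data are 50 random points (spread roughly -2 to 2) centered at (-2, -2) and 50 random points (spread roughly -2 to 2) centered at (2, 2). The first 50 are labeled 0 and the remaining 50 are labeled 1. I don't understand why this error appears; I thought the points were separated well enough to get a result.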

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import statsmodels.api as sm
from scipy import stats
stats.chisqprob = lambda chisq, df: stats.chi2.sf(chisq, df)

N = 100
D = 2

X = np.random.randn(N,D)

# center the first 50 points at (-2,-2)
X[:50,:] = X[:50,:] - 2*np.ones((50,D))

# center the last 50 points at (2, 2)
X[50:,:] = X[50:,:] + 2*np.ones((50,D))

# labels: first 50 are 0, last 50 are 1
T = np.array([0]*50 + [1]*50)

# add a column of ones (note: no intercept column is actually added below)

y = pd.Series(T.tolist())
Xb = pd.concat([pd.Series(X[:,0].tolist()), pd.Series(X[:,1].tolist())], axis=1)

logit_model=sm.Logit(y,Xb)
result=logit_model.fit()
print(result.summary())

What is the reason this error shows up randomly: "PerfectSeparationError: Perfect separation detected, results not available"?
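One way to investigate, offered as a rough sketch and not as a confirmed diagnosis: as far as I know, statsmodels raises this error when the fitted model predicts every label exactly, which can happen when a straight line splits the two classes with no overlap. The NumPy-only check below regenerates the same two clusters many times and counts how often one candidate boundary, the diagonal x1 + x2 = 0, separates them completely. The helper names `make_clusters`, `separated_by_diagonal`, and `separation_rate` are hypothetical, and the diagonal is an assumption chosen for illustration (it passes through the origin, like a model fit without an intercept column).

```python
import numpy as np


def make_clusters(rng, n_per_class=50):
    """Same layout as the question: unit-variance clusters at (-2, -2) and (2, 2)."""
    X = rng.standard_normal((2 * n_per_class, 2))
    X[:n_per_class] -= 2.0   # first half centered at (-2, -2), label 0
    X[n_per_class:] += 2.0   # second half centered at (2, 2), label 1
    T = np.array([0] * n_per_class + [1] * n_per_class)
    return X, T


def separated_by_diagonal(X, T):
    """True if x1 + x2 = 0 leaves every class-0 point below and every class-1 point above."""
    score = X.sum(axis=1)
    return bool(np.all(score[T == 0] < 0) and np.all(score[T == 1] > 0))


def separation_rate(n_trials=200, seed=0):
    """Fraction of random draws that the diagonal separates perfectly."""
    rng = np.random.default_rng(seed)
    hits = sum(separated_by_diagonal(*make_clusters(rng)) for _ in range(n_trials))
    return hits / n_trials


rate = separation_rate()
print(f"fraction of draws separated by x1 + x2 = 0: {rate:.2f}")
```

If a sizable fraction of draws is separable while the rest are not, that would be consistent with the error appearing on some runs and not on others. This check only tests a single line, so it can undercount: a draw that the diagonal fails to split may still be separable by another line.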

0 answers:

No answers