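This Python code randomly raises "PerfectSeparationError". It builds a set of 50 random points centered at (-2, -2) and 50 random points centered at (2, 2). The first 50 are labeled 0 and the remaining 50 are labeled 1. I don't understand why this error occurs; I would have thought the points were well enough separated to get a result.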
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import statsmodels.api as sm
from scipy import stats
stats.chisqprob = lambda chisq, df: stats.chi2.sf(chisq, df)
N = 100
D = 2
X = np.random.randn(N,D)
# center the first 50 points at (-2,-2)
X[:50,:] = X[:50,:] - 2*np.ones((50,D))
# center the last 50 points at (2, 2)
X[50:,:] = X[50:,:] + 2*np.ones((50,D))
# labels: first 50 are 0, last 50 are 1
T = np.array([0]*50 + [1]*50)
# wrap labels and features in pandas objects (note: no column of ones / intercept is added here)
y = pd.Series(T.tolist())
Xb = pd.concat([pd.Series(X[:,0].tolist()), pd.Series(X[:,1].tolist())], axis=1)
logit_model = sm.Logit(y, Xb)
result = logit_model.fit()
print(result.summary())
Why does it randomly show "PerfectSeparationError: Perfect separation detected, results not available"?
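A quick diagnostic that may help narrow this down: when a straight line splits the two classes with no overlap, the logistic maximum-likelihood estimate does not exist (the coefficients can grow without bound), and that is the situation reported as perfect separation. The sketch below regenerates data the same way as the code above and checks whether one particular direction, (1, 1), separates the classes. The helper names `make_data`, `separated_along` and `fraction_separated` are hypothetical, and the choice of direction is an assumption. A `False` result only means that this one direction does not separate the draw, not that no direction does.

```python
import numpy as np

def make_data(rng, n_per_class=50, D=2):
    """Two Gaussian blobs, as in the question: class 0 near (-2,-2), class 1 near (2,2)."""
    X = rng.standard_normal((2 * n_per_class, D))
    X[:n_per_class] -= 2          # shift the first half toward (-2, -2)
    X[n_per_class:] += 2          # shift the second half toward (2, 2)
    T = np.array([0] * n_per_class + [1] * n_per_class)
    return X, T

def separated_along(X, T, w):
    """True if, after projecting onto w, every class-0 point lies strictly below every class-1 point."""
    X = np.asarray(X, dtype=float)
    T = np.asarray(T)
    s = X @ np.asarray(w, dtype=float)
    return bool(s[T == 0].max() < s[T == 1].min())

def fraction_separated(n_trials=200, seed=0, w=(1.0, 1.0)):
    """Share of random draws that are separated along the (assumed) direction w."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_trials):
        X, T = make_data(rng)
        hits += separated_along(X, T, w)
    return hits / n_trials

if __name__ == "__main__":
    # Prints the share of draws that a line perpendicular to (1, 1) splits cleanly.
    print(fraction_separated())
```

If a large share of draws come out separated, that would be consistent with the error appearing on some runs and not others: the clusters being far apart makes the fit harder, not easier.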