Question

I am running a Logistic regression classifier on a dataset that looks like this:

ID| feature1 | feature2 | feature3 | Match
0 |   6      |    9     |  9.5     |   1 
1 |   9      |    7     |  3.9     |   0
2 |   7      |    3     |  5.8     |   1

My model is y(match) = f(feature1, feature2, feature3), where y is a binary variable. I run the following code in Python:

import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

df = pd.read_csv('abc.csv', encoding='latin-1')
X = pd.DataFrame()
X['match'] = df['match']
X['feature1'] = df['feature1']
X['feature2'] = df['feature2']
X['feature3'] = df['feature3']

X = X.dropna(axis=0)  # Drop rows containing NAs
y = X['match'].to_frame()  # Categorical variable Match [Yes, No]
y = np.ravel(y)  # Converting into 1-D array
X = X.drop(['match'], axis=1)  # Drop y from X
X = X.to_numpy()  # Converting dataframe to numpy array (as_matrix() was removed in pandas 1.0)

# Splitting into training and testing data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Applying logistic regression using sklearn
model_1 = LogisticRegression(penalty='l2', C=1)
model_1.fit(X_train, y_train)
model_1.predict(X_test)

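The code above returns [0, 0, 0, ..., 0, 0, 0] for model_1.predict(X_test). I have checked it in many places and cannot find anything wrong with my code. It runs without errors but produces unexpected results. Please help.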
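One way to narrow this down is to compare how often each class appears in the training labels with how often it appears in the predictions. The sketch below is only a diagnostic under the assumption that class imbalance might be involved; the helper name summarize_labels and the demo labels are made up for illustration and do not come from abc.csv.

```python
from collections import Counter


def summarize_labels(y_train, y_pred):
    """Compare training class frequencies with predicted class frequencies.

    Hypothetical helper: both arguments are plain sequences of labels.
    """
    train_counts = Counter(y_train)
    pred_counts = Counter(y_pred)
    # Most frequent class in the training labels and how many rows it has
    majority_class, majority_count = train_counts.most_common(1)[0]
    return {
        "train_counts": dict(train_counts),
        "pred_counts": dict(pred_counts),
        "majority_class": majority_class,
        # Share of training rows in the majority class; this is also the
        # training accuracy of a model that always predicts that class
        "majority_share": majority_count / len(y_train),
        # True when the predictions contain exactly one distinct label
        "predicts_single_class": len(pred_counts) == 1,
    }


# Made-up labels for illustration only (not taken from abc.csv)
y_train_demo = [0] * 9 + [1]
y_pred_demo = [0, 0, 0]
summary = summarize_labels(y_train_demo, y_pred_demo)
print(summary)
```

With the real data this could be called as summarize_labels(y_train.tolist(), model_1.predict(X_test).tolist()). If the majority share turns out to be high, it may be worth trying LogisticRegression(class_weight='balanced') or looking at model_1.predict_proba(X_test) to see how close the probabilities are to 0.5. Whether that applies here depends on the actual label distribution, which is not shown in the question.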

Getting a list of zeros from model.fit of the Logistic regression classifier in sklearn (Python)

0 answers: