Question

    hr  Time0                Time1  Day  Time2  cluster1       cluster3
0   20  11/4/2017 20:39      Night  3    0      Entertainment  2
1   1   21/03/2017 01:33:48  Night  3    0      Work           1
2   22  16/03/2017 22:26:15  Night  5    0      Work           1
3   2   2/4/2017 2:03        Night  1    0      Work           1
4   2   2/4/2017 2:03        Night  1    0      Work           1
5   2   2/4/2017 2:03        Night  1    0      Work           1
6   19  8/4/2017 19:02       Night  7    0      Entertainment  2
7   11  17/03/2017 11:17:19  Day    6    1      Entertainment  2
8   22  16/03/2017 22:28:58  Night  5    0      Work           1
9   2   2/4/2017 2:03        Night  1    0      Work           1
10  2   2/4/2017 2:03        Night  1    0      Work           1
11  2   2/4/2017 2:03        Night  1    0      Work           1
12  2   2/4/2017 2:03        Night  1    0      Work           1
13  2   2/4/2017 2:03        Night  1    0      Work           1
14  0   5/4/2017 0:46        Night  4    0      Entertainment  2
15  0   5/4/2017 0:46        Night  4    0      Entertainment  2
16  20  11/4/2017 20:37      Night  3    0      Entertainment  2

I ran a logistic regression on the dataset above and want to predict the cluster from hr, but this code always predicts a single label: cluster #1.

Here is my code:

import pandas as pd
from sklearn.linear_model import LogisticRegression

# Read the file once; no index_col, so positions match the raw CSV columns
noti = pd.read_csv('C:\\path\\to\\Final.csv')

# CSV column 1 is the feature (Time), CSV column 8 is the label (Group)
X = noti.iloc[:, [1]].copy()
X.columns = ['Time']
# Keep y one-dimensional, which is the shape scikit-learn expects for labels
y = noti.iloc[:, 8].rename('Group')
print(X)
print(y)

# Create logistic regression object
model = LogisticRegression()
# Train the model on the full dataset and print its training accuracy
model.fit(X, y)
print('Score: \n', model.score(X, y))
# Equation coefficient and intercept
print('Coefficient: \n', model.coef_)
print('Intercept: \n', model.intercept_)
# Predict on the same rows the model was trained on
res = model.predict(X)
print(pd.DataFrame(res))

Why does my logistic regression always predict the same label?
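A possible first check, sketched under stated assumptions rather than as a confirmed diagnosis: look at how the labels are distributed, then compare a model that sees the hour as one numeric feature with a model that sees each hour as its own category. A single numeric hour is treated as a linear quantity, so hours 22 and 2 sit far apart even though they are close on the clock. The 17 rows below are copied from the sample shown above; using `hr` as the feature and `cluster3` as the label is an assumption about which columns the code reads, and the full `Final.csv` may behave differently from this small sample.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# The 17 sample rows from the question. Treating 'hr' as the feature and
# 'cluster3' as the label is an assumption, not something the post confirms.
hr = [20, 1, 22, 2, 2, 2, 19, 11, 22, 2, 2, 2, 2, 2, 0, 0, 20]
cluster3 = [2, 1, 1, 1, 1, 1, 2, 2, 1, 1, 1, 1, 1, 1, 2, 2, 2]
df = pd.DataFrame({"hr": hr, "cluster3": cluster3})

# 1) Class balance: a dominant class pulls predictions toward itself.
class_counts = df["cluster3"].value_counts()
print(class_counts)

# 2) Hour as a single numeric feature (what the question's code does).
raw_model = LogisticRegression().fit(df[["hr"]], df["cluster3"])
raw_pred = raw_model.predict(df[["hr"]])

# 3) Hour as a categorical feature: one indicator column per distinct hour.
X_onehot = pd.get_dummies(df["hr"], prefix="hr", dtype=float)
onehot_model = LogisticRegression().fit(X_onehot, df["cluster3"])
onehot_pred = onehot_model.predict(X_onehot)

# Compare actual labels against each model's predictions.
print(pd.crosstab(df["cluster3"], raw_pred,
                  rownames=["actual"], colnames=["predicted (numeric hr)"]))
print(pd.crosstab(df["cluster3"], onehot_pred,
                  rownames=["actual"], colnames=["predicted (one-hot hr)"]))
```

If the two tables differ noticeably on the real data, the encoding of the hour is a likely contributor; if both collapse to one label, class imbalance or the choice of label column would be the next thing to inspect.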

0 answers: