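I am using LinearRegression to predict values. I am building a user login prediction system, and the data is each user's login times.
My dataset contains the following values: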
login_hour  login_minute  login_second  login_day
11          20            30            2
11          21            45            2
11          45            10            2
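There are thousands of rows like these, and I am trying to make predictions from them. I am using the sklearn library, but some of the output values are floats printed with nothing after the decimal point. Some of the numbers are also negative, even though the dataset contains no negative values.
Below is my code: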
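import time

from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# df (all login rows) and user_list (user names) are defined elsewhere.
model = LinearRegression()
try:
    for users in user_list:
        # Select this user's four login columns.
        data = df.loc[df['user_list'] == users, 'login_hour':'login_day']
        X_train, X_test = train_test_split(data, test_size=0.5, random_state=int(time.time()))
        # Fits one half of the rows as inputs against the other half as targets.
        model.fit(X_train, X_test)
        if not df.loc[df['user_list'] == users].empty:
            X_predict = df.loc[df['user_list'] == users, 'login_hour':'login_day']
            print(users)
            print(model.predict(X_predict))
except Exception as e:
    print(e)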
Below is the output:
User1
[[19.20795654 30.81796908 -1.17348934 4. ]
[19.20795654 30.81796908 -1.17348934 4. ]
[19.20795654 30.81796908 -1.17348934 4. ]
[19.24304221 38.74216465 35.94425407 4. ]
[19.29593815 36.93326369 40.6377267 4. ]
[18.88357989 53.14036774 11.89626968 4. ]
[18.88357989 53.14036774 11.89626968 4. ]
[19.29593815 36.93326369 40.6377267 4. ]
[19.29593815 36.93326369 40.6377267 4. ]
[18.88357989 53.14036774 11.89626968 4. ]
[19.43300738 28.03476807 37.10549659 4. ]
[19.43300738 28.03476807 37.10549659 4. ]
[19.43300738 28.03476807 37.10549659 4. ]
[19. 50. 51. 4. ]
[19. 50. 51. 4. ]
[19. 50. 51. 4. ]]
User2
[[ 1.96245603e+01 8.73215646e+00 5.32679614e+00 4.00000000e+00]
[ 1.96245603e+01 8.73215646e+00 5.32679614e+00 4.00000000e+00]
[ 1.90995539e+01 3.32999402e+01 2.32454303e+01 4.00000000e+00]
[ 1.90995539e+01 3.32999402e+01 2.32454303e+01 4.00000000e+00]
[ 2.05057605e+01 -2.92241595e+01 -1.76151096e+01 4.00000000e+00]
[ 1.97164859e+01 1.09897454e+01 1.64568207e+01 4.00000000e+00]
[ 1.87643468e+01 3.99670330e+01 1.50683761e+01 4.00000000e+00]
[ 1.88341735e+01 3.99791100e+01 1.98189050e+01 4.00000000e+00]
[ 1.91961218e+01 3.78000218e+01 3.95673183e+01 4.00000000e+00]
[ 1.91006661e+01 4.22668915e+01 4.28252518e+01 4.00000000e+00]
[ 1.82946119e+01 7.81641824e+01 9.76385715e+01 4.00000000e+00]
[ 2.00000000e+01 -4.54747351e-13 4.30000000e+01 4.00000000e+00]
[ 2.00000000e+01 -4.54747351e-13 4.30000000e+01 4.00000000e+00]
[ 2.00000000e+01 -4.54747351e-13 4.30000000e+01 4.00000000e+00]]
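Question
The output contains values such as 4. and 50. Why is nothing shown after the decimal point, and how can I get them displayed as 4.0? Is it also possible to limit the output to two decimal places, or even to none?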
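To make the formatting part concrete, here is a minimal sketch of how the display could be controlled, assuming the predictions come back as a plain NumPy array. The name `pred` is a hypothetical stand-in for the result of model.predict, with two rows copied from the User1 output. NumPy prints 4.0 as "4." by default, so this is a display matter rather than lost data.

```python
import numpy as np

# Hypothetical stand-in for model.predict(...); values copied from the User1 output.
pred = np.array([[19.20795654, 30.81796908, -1.17348934, 4.0],
                 [19.0, 50.0, 51.0, 4.0]])

# Format with exactly two decimals; this changes only the printed text.
two_dp = np.array2string(pred, precision=2, floatmode='fixed', suppress_small=True)
print(two_dp)

# Rounding changes the stored values themselves.
rounded = np.round(pred, 2)

# Round to the nearest whole number and cast to int for zero decimals.
whole = np.rint(pred).astype(int)
print(whole)
```

np.set_printoptions accepts the same precision, floatmode and suppress options if the setting should apply to every print call instead of one string.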
Some of my output values are also negative, for example -4.54747351e-13. My dataset contains no negative values, yet the output does. Can anyone explain why?
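For context on the negative values: a linear model is an unconstrained linear function, so nothing limits its predictions to the range seen in the training data, and a value like -4.54747351e-13 is most likely zero plus floating-point rounding error. One possible workaround, sketched below under assumptions, is to round the predictions and clamp each column to a valid range. The name `pred` and the bounds are hypothetical, and the day range in particular is a guess about how login_day is encoded.

```python
import numpy as np

# Hypothetical stand-in for model.predict(...); rows copied from the User2 output.
pred = np.array([[20.5057605, -29.2241595, -17.6151096, 4.0],
                 [20.0, -4.54747351e-13, 43.0, 4.0]])

# Assumed valid ranges for hour, minute, second, day (the day range is a guess).
lower = np.array([0, 0, 0, 0])
upper = np.array([23, 59, 59, 6])

# Round to whole numbers first, then clamp each column into its range.
cleaned = np.clip(np.rint(pred), lower, upper).astype(int)
print(cleaned)
```

Clamping only hides out-of-range predictions; it does not make the underlying model any more accurate.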