scikit-learn regression: "value too large for dtype('float64')" error

Time: 2018-10-22 18:18:43

Tags: python machine-learning scikit-learn linear-regression data-science

I am reading some data from a CSV file and trying to fit a Bayesian ridge regression model, but I get the error: Input contains NaN, infinity or a value too large for dtype('float64'). Any tips are greatly appreciated.

import pandas as pd
import numpy as np

df = pd.read_csv('C:/Users/baselinekWh.csv', index_col='Date', parse_dates=True)

df

I don't think there is any NaN data, because I can plot the data with a matplotlib scatter plot:

import matplotlib.pyplot as plt

plt.scatter(df['OSAT'], df['kWh'], color='grey', marker='+', label='kWh')  # label gives plt.legend() an entry to show


plt.xlabel('OSAT')
plt.ylabel('kWh')
plt.title('kWh Model')

plt.legend()

plt.show()

[Image: scatter plot of kWh against OSAT]

df.describe() looks like this:

[Image: output of df.describe()]
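A successful scatter plot is not necessarily evidence that the data is clean: as far as I know, matplotlib simply skips points whose coordinates are NaN. Likewise, the count row of df.describe() excludes missing values, so it only reveals them when compared against len(df). A minimal sketch of a more direct check follows. The helper name summarize_non_finite and the small demo frame are hypothetical stand-ins, not part of the original data.

```python
import numpy as np
import pandas as pd


def summarize_non_finite(frame):
    """Count NaN and +/-inf entries in each numeric column of a DataFrame."""
    numeric = frame.select_dtypes(include=[np.number])
    nan_counts = numeric.isna().sum()      # missing values per column
    inf_counts = np.isinf(numeric).sum()   # +inf or -inf per column
    return pd.DataFrame({'nan': nan_counts, 'inf': inf_counts})


# Hypothetical stand-in for the CSV data, with one NaN and one inf planted.
demo = pd.DataFrame({
    'OSAT': [50.0, np.nan, 70.0, 65.0],
    'kWh': [10.0, 12.0, np.inf, 11.0],
})

report = summarize_non_finite(demo)
print(report)
```

Running the same helper on the real df should show which column, if any, holds the offending values.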

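When I try to fit the model, I get the error. Any tips? I am trying to follow the steps on the scikit-learn website. Could the parameters I am specifying be the problem? Not a lot of wisdom there yet ;) http://scikit-learn.org/stable/modules/linear_model.html#bayesian-ridge-regression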

from sklearn import linear_model

X = df[list(set(df.columns).difference(['kWh']))].values # X-> features
Y = df[['kWh']].values # Y -> target

reg = linear_model.BayesianRidge(alpha_1=1e-06, alpha_2=1e-06, compute_score=False, copy_X=True,
       fit_intercept=True, lambda_1=1e-06, lambda_2=1e-06, n_iter=300,
       normalize=False, tol=0.001, verbose=False)

reg.fit(X, Y)
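If the data does turn out to contain NaN or infinite entries, one possible approach is to drop those rows before fitting. The sketch below assumes that dropping rows is acceptable for this dataset (imputing is another option). It uses synthetic data in place of the CSV and the default BayesianRidge() settings rather than the explicit parameters above, since some of those keyword names have changed across scikit-learn releases.

```python
import numpy as np
import pandas as pd
from sklearn import linear_model

# Hypothetical stand-in for the CSV: kWh is roughly 2 * OSAT plus noise.
rng = np.random.default_rng(0)
osat = rng.uniform(30, 90, size=50)
demo = pd.DataFrame({'OSAT': osat, 'kWh': 2.0 * osat + rng.normal(0, 1, size=50)})

# Plant one NaN and one inf to reproduce the kind of input that triggers the error.
demo.loc[3, 'OSAT'] = np.nan
demo.loc[7, 'kWh'] = np.inf

# Treat +/-inf as missing, then drop every row with a missing value.
clean = demo.replace([np.inf, -np.inf], np.nan).dropna()

X = clean.drop(columns=['kWh']).values  # features, shape (n_samples, n_features)
y = clean['kWh'].values                 # target as a 1-D array

reg = linear_model.BayesianRidge()      # default hyperparameters
reg.fit(X, y)
print(len(demo), len(clean), reg.coef_)
```

Passing the target as a 1-D array (df['kWh'] rather than df[['kWh']]) also matches the shape scikit-learn expects for y.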

0 answers:

No answers