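I have some data that I am reading from a CSV file, and I am trying to fit a Bayesian ridge regression model, but I get the error Input contains NaN, infinity or a value too large for dtype('float64').
Any tips would be greatly appreciated.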
import pandas as pd
import numpy as np
df = pd.read_csv('C:/Users/baselinekWh.csv', index_col='Date', parse_dates=True)
df
I don't think there is any NaN data, because I can plot the data with a matplotlib scatter plot:
import matplotlib.pyplot as plt
plt.scatter(df['OSAT'], df['kWh'], color='grey', marker='+')
plt.xlabel('OSAT')
plt.ylabel('kWh')
plt.title('kWh Model')
plt.legend()
plt.show()
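A scatter plot is not a reliable check here, since matplotlib simply leaves out points whose coordinates are NaN. A more direct check might look like the sketch below. It uses a small made-up frame in place of the real CSV: the column names OSAT and kWh come from the post, but the values and the name demo are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV data; values are made up.
demo = pd.DataFrame({
    "OSAT": [50.0, np.nan, 70.0, 80.0],
    "kWh": [10.0, 12.0, np.inf, 16.0],
})

# Count missing values per column.
nan_counts = demo.isna().sum()

# Count +/- infinity per column (isna() does not flag these by default).
inf_counts = np.isinf(demo).sum()

# Rows scikit-learn would reject: any NaN or infinite entry.
bad_rows = demo[~np.isfinite(demo).all(axis=1)]

print(nan_counts)
print(inf_counts)
print(bad_rows)
```

Running the same three checks on the real df should show which columns and rows trigger the error.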
`df.describe()` looks like this:
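When I try to fit the model, I get the error. Any tips? I am trying to follow the steps on the scikit-learn website. Could the problem be the parameters I am passing? Not a lot of wisdom there yet ;) http://scikit-learn.org/stable/modules/linear_model.html#bayesian-ridge-regression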
from sklearn import linear_model
X = df[list(set(df.columns).difference(['kWh']))].values # X-> features
Y = df[['kWh']].values # Y -> target
reg = linear_model.BayesianRidge(alpha_1=1e-06, alpha_2=1e-06, compute_score=False, copy_X=True,
                                 fit_intercept=True, lambda_1=1e-06, lambda_2=1e-06, n_iter=300,
                                 normalize=False, tol=0.001, verbose=False)
reg.fit(X, Y)
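One possible way around the error, sketched under assumptions: convert infinities to NaN, drop incomplete rows, and pass a 1-D target instead of a column vector. The data below is synthetic and the names demo and clean are hypothetical. The constructor is left at its defaults because some argument names, such as n_iter and normalize, have changed across scikit-learn releases.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import BayesianRidge

# Synthetic stand-in for the CSV data; values are made up.
demo = pd.DataFrame({
    "OSAT": [50.0, 55.0, np.nan, 65.0, 70.0, 75.0],
    "kWh": [10.0, 11.5, 12.0, np.inf, 13.8, 15.2],
})

# Turn +/- infinity into NaN, then drop every row with a missing value.
clean = demo.replace([np.inf, -np.inf], np.nan).dropna()

X = clean.drop(columns=["kWh"]).values  # 2-D feature matrix
y = clean["kWh"].values                 # 1-D target vector

reg = BayesianRidge()  # default hyperparameters
reg.fit(X, y)
print(reg.coef_, reg.intercept_)
```

Dropping rows is only one option. Whether it is appropriate depends on how many rows are affected and why the values are missing.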