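I'm trying to train LASSO and Ridge models on my data, and I'm running into a RuntimeWarning that I believe is related to division by zero. Unfortunately, I'm not sure why it happens. I filter my features to exclude those with zero variance, yet the warning still shows up from time to time inside the initial for loop. Because it doesn't appear on every run, I suspect it has to do with how the data is split into training and test sets.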
X has about 900 columns (features) and 90 rows (samples); y has 90 rows and 1 column.
import time

import numpy as np
from sklearn.linear_model import LassoCV, RidgeCV
from sklearn.model_selection import train_test_split

# X, chip and bin_info are pandas objects defined earlier (not shown)
start = time.time()
alphas = 10**np.linspace(10,-2,100)*0.5
ridgecv = RidgeCV(alphas=alphas, scoring='neg_mean_squared_error', cv=5, normalize=True)
lassocv = LassoCV(alphas=None, cv=5, max_iter=100000, normalize=True, n_jobs=6, verbose=False)
for i in range(0, len(bin_info.index)):
    idx = int(bin_info.iloc[[i]].T.columns.values)
    y = chip.iloc[idx].T
    # split
    X_train, X_test , y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=None)
    # remove features with zero variance
    X_train = X_train.loc[:, X_train.var() > 0]
    X_test = X_test[X_train.columns.values]
    # fit
    ridgecv.fit(X_train, y_train.values.ravel())
    lassocv.fit(X_train, y_train.values.ravel())
    # predict
    r_predicted = ridgecv.predict(X_test)
    l_predicted = lassocv.predict(X_test)
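
A minimal sketch of the splitting hypothesis, using synthetic data rather than the real X. It only shows that a feature can pass the zero-variance filter on the whole training set and still be constant inside one of the cv=5 folds. The array shape, the seed, and the single non-zero entry are assumptions made for illustration, and whether this is what actually triggers the warning is not confirmed.

```python
import numpy as np
from sklearn.model_selection import KFold

# Hypothetical stand-in for X_train: 63 rows (70% of 90 samples), 5 features
rng = np.random.RandomState(0)
X_train = rng.normal(size=(63, 5))

# Make column 0 almost constant: every value is 0 except one row
X_train[:, 0] = 0.0
X_train[3, 0] = 1.0

# The global filter keeps this column because its variance is non-zero
passes_global_filter = bool(X_train[:, 0].var() > 0)

# Inside cross-validation, each fold trains on a subset of the rows.
# If the only non-zero row is held out, the column is constant in that fold.
fold_has_constant = []
for train_idx, _ in KFold(n_splits=5).split(X_train):
    fold_var = X_train[train_idx].var(axis=0)
    fold_has_constant.append(bool((fold_var == 0).any()))

print(passes_global_filter)
print(fold_has_constant)
```

Because train_test_split is called with random_state=None, which rows end up in which fold changes from run to run, which would be consistent with a warning that appears only some of the time.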
Any insight into why this happens would be greatly appreciated.
Edit: the exact warning is posted below:
/Users/ss/miniconda3/lib/python3.6/site-packages/numpy/lib/function_base.py:3184: RuntimeWarning: invalid value encountered in true_divide
c /= stddev[None, :]
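
For reference, the line `c /= stddev[None, :]` appears to come from NumPy's `np.corrcoef`, which divides the covariance matrix by the standard deviations. The sketch below is one way to reproduce the same kind of warning in isolation: it correlates a varying vector with a constant one. The constant `y_pred` is a hypothetical stand-in (for example, a prediction from a model whose coefficients were all shrunk to zero), not something taken from the code above.

```python
import warnings

import numpy as np

y_true = np.array([1.0, 2.0, 3.0, 4.0])
# Hypothetical constant prediction: its standard deviation is exactly zero
y_pred = np.full(4, 2.5)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # corrcoef divides by the standard deviations, so 0/0 yields NaN
    r = np.corrcoef(y_true, y_pred)[0, 1]

print(r)
for w in caught:
    print(w.category.__name__, w.message)
```

If a correlation is computed anywhere on the predictions or on the features, checking for constant inputs with something like `np.std(values) == 0` before that call would show whether this is the source.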