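Could you please help me? I have categorical and string data that has been preprocessed, converted to numeric form, and normalized. I want to use this data to predict the continuous value "Montant TLPE". I used the correlation matrix to find the columns that correlate with "Montant TLPE" at a ratio of 0.4% or higher. The best model I found for the data is LightGBM, with an accuracy of 47%. Here is my model code: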
import lightgbm as lgbm

def runLGBM(train_X, train_y, test_X, seed_val=42):
    # Hyperparameters for a gradient-boosted regression model evaluated with RMSE
    params = {
        'boosting_type': 'gbdt', 'objective': 'regression', 'nthread': -1, 'verbose': 0,
        'num_leaves': 33, 'learning_rate': 0.05, 'max_depth': -1,
        'subsample': 0.69, 'subsample_freq': 1, 'colsample_bytree': 0.5,
        'reg_alpha': 1, 'reg_lambda': 0.002, 'metric': 'rmse',
        'min_split_gain': 0.5, 'min_child_weight': 1, 'min_child_samples': 31, 'scale_pos_weight': 1,
        'seed': seed_val}  # seed_val was otherwise unused; passing it makes runs reproducible
    #kf = KFold(n_splits=5, shuffle=True, random_state=seed_val)
    # Verbosity is controlled by 'verbose' in params; recent LightGBM versions no longer accept silent=True here
    train_set = lgbm.Dataset(train_X, train_y)
    # Train for a fixed 180 rounds (no validation set, so no early stopping)
    model = lgbm.train(params, train_set=train_set, num_boost_round=180)
    # best_iteration is only meaningful with early stopping; without it, all trees are used
    pred_test_y = model.predict(test_X, num_iteration=model.best_iteration)
    return pred_test_y, model

predictions, model = runLGBM(X_train, y_train, X_test, seed_val=42)
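For context, the correlation-based column selection mentioned above could look roughly like the sketch below. It is only an illustration under assumptions: the helper name `select_correlated_columns`, the toy columns `surface` and `noise`, and the threshold of 0.4 are all hypothetical and are not taken from the real dataset.

```python
import numpy as np
import pandas as pd

def select_correlated_columns(df, target, threshold):
    """Return the feature columns whose absolute Pearson correlation
    with `target` is at least `threshold` (hypothetical helper)."""
    # Correlation of every column with the target, excluding the target itself
    corr = df.corr()[target].drop(target)
    return corr[corr.abs() >= threshold].index.tolist()

# Toy data standing in for the real, already numeric dataset
rng = np.random.default_rng(42)
n = 200
surface = rng.normal(size=n)   # strongly related to the target
noise = rng.normal(size=n)     # unrelated to the target
toy = pd.DataFrame({
    "surface": surface,
    "noise": noise,
    "Montant TLPE": 3.0 * surface + 0.1 * rng.normal(size=n),
})

selected = select_correlated_columns(toy, "Montant TLPE", threshold=0.4)
print(selected)
```

Note that a filter like this only captures linear, one-column-at-a-time relationships, so columns it drops may still carry signal for a tree-based model.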
What can I do to improve the accuracy?
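Since the target is continuous, the score is a regression metric rather than a classification accuracy. Which metric the 47% figure refers to is not stated here, so the sketch below simply shows two common candidates: RMSE (the `metric` set in `params`) and the R² score. The function names and the toy arrays are assumptions for illustration only.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return float(1.0 - ss_res / ss_tot)

# Toy values, not real data
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.0, 2.0, 3.0, 5.0])

print(rmse(y_true, y_pred))
print(r2(y_true, y_pred))
```

With real data, these would be called on the held-out targets and on `predictions` from `runLGBM`.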