Question

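Could you please help me? I have categorical and string data that has been preprocessed, numerically encoded, and standardized. I want to use this data to predict the continuous value "Montant TLPE". I used the correlation matrix to find the columns that are correlated with "Montant TLPE" at a level of 0.4% or higher. The model I found to fit the data best is LightGBM, with an accuracy of 47%. Here is my model code: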

import lightgbm as lgbm

def runLGBM(train_X, train_y, test_X, seed_val=42):
    # Hyperparameters for a gradient-boosted regression model scored by RMSE.
    params = {
        'boosting_type': 'gbdt', 'objective': 'regression', 'nthread': -1, 'verbose': 0,
        'num_leaves': 33, 'learning_rate': 0.05, 'max_depth': -1,
        'subsample': 0.69, 'subsample_freq': 1, 'colsample_bytree': 0.5,
        'reg_alpha': 1, 'reg_lambda': 0.002, 'metric': 'rmse',
        'min_split_gain': 0.5, 'min_child_weight': 1, 'min_child_samples': 31, 'scale_pos_weight': 1,
        'seed': seed_val}  # seed for reproducible row/column subsampling

    #kf = KFold(n_splits=5, shuffle=True, random_state=seed_val)

    train_set = lgbm.Dataset(train_X, train_y)

    model = lgbm.train(params, train_set=train_set, num_boost_round=180)
    # No early stopping is configured, so all 180 boosting rounds are used for prediction.
    pred_test_y = model.predict(test_X)

    return pred_test_y, model

predictions, model = runLGBM(X_train, y_train, X_test, seed_val=42)
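
The correlation-based column selection described above might look roughly like the following. This is only a minimal sketch under assumptions: the helper name `select_correlated_columns`, the synthetic `demo` frame, and the cutoff of 0.4 are hypothetical and chosen for illustration; the post does not show the actual selection code, and its "0.4%" may refer to a different cutoff.

```python
import numpy as np
import pandas as pd

def select_correlated_columns(df, target, threshold):
    """Return feature columns whose absolute Pearson correlation
    with `target` is at least `threshold` (hypothetical helper)."""
    # Restrict to numeric columns so corr() behaves the same across pandas versions.
    corr = df.select_dtypes(include="number").corr()[target].drop(target)
    return corr[corr.abs() >= threshold].index.tolist()

# Synthetic stand-in data (assumption): one informative column, one pure-noise column.
rng = np.random.default_rng(42)
n = 500
strong = rng.normal(size=n)
weak = rng.normal(size=n)
demo = pd.DataFrame({
    "strong_feature": strong,
    "weak_feature": weak,
    "Montant TLPE": 3.0 * strong + rng.normal(scale=0.5, size=n),
})

selected = select_correlated_columns(demo, "Montant TLPE", threshold=0.4)
print(selected)
```

Note that Pearson correlation only measures linear association, so a filter like this can drop columns that a tree-based model such as LightGBM could still use through non-linear splits.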

What can I do to improve the accuracy?

model-fitting linear-regression data

0 answers: