LightGBM error: length not the same as data

Time: 2018-06-27 14:06:24

Tags: machine-learning lightgbm

I am using LightGBM to compute feature importances, but I get the error LightGBMError: b'len of label is not same with #data'. The shapes are (73147, 12) for X and (73147,) for y.

Code:

import numpy as np
from sklearn.model_selection import train_test_split
import lightgbm as lgb

# Initialize an empty array to hold feature importances
feature_importances = np.zeros(X.shape[1])

# Create the model with several hyperparameters
model = lgb.LGBMClassifier(objective='binary', boosting_type = 'goss', n_estimators = 10000, class_weight = 'balanced')

# Fit the model twice to avoid overfitting
for i in range(2):

    # Split into training and validation set
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = i)

    # Train using early stopping
    model.fit(X, y_train, early_stopping_rounds=100, eval_set = [(X_test, y_test)], 
              eval_metric = 'auc', verbose = 200)

    # Record the feature importances
    feature_importances += model.feature_importances_

1 answer:

Answer 0 (score: 1)

There appears to be a typo in your code; instead of

model.fit(X, y_train, [...])

it should be

model.fit(X_train, y_train, [...])

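As written, X and y_train understandably have different lengths: X still contains all 73147 rows, while y_train holds only the labels of the roughly 75% of rows assigned to the training split. Hence the error.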
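
To make the mismatch concrete, here is a minimal sketch that reproduces only the split, not the model training. It assumes synthetic stand-ins for X and y with the shapes reported in the question; `same_length` is a hypothetical helper introduced purely for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-ins with the shapes reported in the question
# (assumption: the real X and y are array-like with these dimensions).
rng = np.random.default_rng(0)
X = rng.normal(size=(73147, 12))
y = rng.integers(0, 2, size=73147)

# Same split settings as in the question's loop (first iteration).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)


def same_length(features, labels):
    """Hypothetical helper: True when features and labels have equal row counts."""
    return len(features) == len(labels)


# The call in the question pairs the full X with the training labels only.
print("X vs y_train:      ", len(X), len(y_train), same_length(X, y_train))
# The corrected call pairs the training features with the training labels.
print("X_train vs y_train:", len(X_train), len(y_train), same_length(X_train, y_train))
```

A check like this just before `model.fit` would surface the problem with a clearer message than the one LightGBM raises. Separately, LightGBM 4.0 and later no longer accept `early_stopping_rounds` or `verbose` in `fit`; with those versions the equivalent is `callbacks=[lgb.early_stopping(100), lgb.log_evaluation(200)]`.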