Multi-label prediction

Time: 2019-08-22 10:29:58

Tags: python pandas machine-learning multilabel-classification

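I am trying to run predictions on a multi-label dataset (4 class labels), but I get this error when I try to predict for each class:

ValueError: could not broadcast input array from shape (2365,4) into shape (2365)

Here is the snippet that creates the zero matrix with the shape I want to return: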

import numpy as np

preds = np.zeros((test.shape[0], len(cols)))
preds.shape

The above returns the shape (2365, 4).
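The shape mismatch in the error message can be reproduced with NumPy alone. The sketch below assumes that the prediction call returns a 2-D array with one column per class; `fake_pred` is a hypothetical stand-in for that result, not output from the actual model.

```python
import numpy as np

n_rows, n_labels = 2365, 4
preds = np.zeros((n_rows, n_labels))

# Hypothetical stand-in for a predict() call that returns one column per class.
fake_pred = np.full((n_rows, n_labels), 0.25)

try:
    # preds[:, 0] is a single column of shape (2365,), so a (2365, 4) array cannot fit.
    preds[:, 0] = fake_pred
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# If the prediction already holds one column per label, it can be stored whole instead.
preds[:, :] = fake_pred
```

If this matches what is happening, each iteration of a per-column loop needs a 1-D prediction of length 2365, or the whole matrix should be assigned in one step.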

This is the part where I make the predictions, and it raises the error above.

import xgboost as xgb
from sklearn.model_selection import train_test_split

for i, j in enumerate(cols):
    print('fitting column: ' + j)
    # Split into training and validation sets for this label
    X_train, X_val, y_train, y_val = train_test_split(x, y[j], test_size=0.2, random_state=42)
    dtrain = xgb.DMatrix(X_train, label=y_train)
    dval = xgb.DMatrix(X_val, label=y_val)

    model = xgb.train(params, dtrain, num_rounds)
    train_pred = model.predict(dtrain)

    # Evaluate model performance on the validation set
    val_pred = model.predict(dval)
    print('predicting for: ' + j)
    #print("Training score: {} and Validation score: {}".format(log_loss(y_train, train_pred), log_loss(y_val, val_pred)))

    # Predict on the test set for this label; the ValueError is raised on this assignment
    preds[:, i] = model.predict(xgb.DMatrix(test))
print('Finished Training')

Thanks for your help.

0 answers:

No answers