Question

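I trained a model in Python using the following code (I am not using a test set for this example; I am training and predicting on the same dataset to make the problem easier to illustrate):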

import numpy as np
import xgboost as xgb

# trainingData and Y (the labels) are loaded elsewhere and not shown here
params = {'learning_rate': 0.1, 'obj': 'binary:logistic', 'n_estimators': 250, 'scale_pos_weight': 0.2, 'max_depth': 15, 'min_weight': 1, 'colsample_bytree': 1, 'gamma': 0.1, 'subsample': 0.95}

X = np.array(trainingData, dtype=np.uint32)  # training data was generated from a csv
X = xgb.DMatrix(np.asmatrix(X), label=Y)

clf = xgb.train(params, X)
clf.save_model('xgb_test.model')
X.save_binary('test.buffer')

# Predict on the same data that was used for training
answer = clf.predict(X)

The prediction yields roughly 40k zeros and 270k ones.
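The counting step on the Python side is not shown above. A minimal sketch of how the two classes could be tallied, assuming the same 0.5 cutoff that the C++ loop further down uses; the helper name `count_classes` is hypothetical and not part of xgboost:

```python
def count_classes(predictions, threshold=0.5):
    """Return (zero_count, one_count) for an iterable of prediction scores.

    Follows the same rule as the C++ loop in this post: a score strictly
    below the threshold counts as a zero, anything else counts as a one.
    """
    zero_count = 0
    one_count = 0
    for p in predictions:
        if p < threshold:
            zero_count += 1
        else:
            one_count += 1
    return zero_count, one_count


# Hypothetical usage with the array returned by clf.predict(X):
# zeros, ones = count_classes(answer)
```

Using one helper with one cutoff on both sides rules out the possibility that the totals differ only because the two programs threshold the scores differently.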

The model is then loaded in C++ with the following code:

// Uses the XGBoost C API (xgboost/c_api.h); handle and dHandle are declared elsewhere (not shown)
const char * fileName = "blahblah/xgb_test.model";
int x = XGBoosterLoadModel(handle, fileName);
if (x == 0) {
    printf("Successfully Loaded Model\n");
}


const char * predictionData = "blahblah/test.buffer";
x = XGDMatrixCreateFromFile(predictionData, 0, &dHandle);


if (x == 0) {
    printf("Successfully Loaded Data\n");
}

bst_ulong  out2;
const float *m_TestResults2;
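// In the six-argument form of this C API call the arguments are:
// booster, data, option_mask (0 here), ntree_limit (1 here), output length, output array
x = XGBoosterPredict(handle, dHandle, 0, 1, &out2, &m_TestResults2);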
if (x == 0) {
    printf("Successful Prediction\n");
}

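int zeroCount = 0;
int oneCount = 0;

// Threshold each prediction at 0.5 and tally the two classes
for (bst_ulong i = 0; i < out2; i++) {
    if (m_TestResults2[i] < 0.5) {
        zeroCount++;
    }
    else {
        oneCount++;
    }
}
// printf needs a format specifier; "string" + int is pointer arithmetic in C++
printf("Number of Zeroes: %d\n", zeroCount);
printf("Number of Ones: %d\n", oneCount);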

For the C++ prediction, I get roughly 55k zeros.
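Class totals alone can hide where the two runs diverge. One way to narrow this down is to write both prediction vectors out (for example, one value per line) and compare them row by row. The sketch below assumes the values have already been read into two Python sequences; `compare_predictions` and its parameters are hypothetical names used only for illustration:

```python
def compare_predictions(py_preds, cpp_preds, threshold=0.5, tol=1e-6):
    """Compare two equally long sequences of prediction scores.

    Returns a dict with the largest absolute difference, the number of
    rows whose scores differ by more than `tol`, and the number of rows
    that fall on different sides of `threshold`.
    """
    if len(py_preds) != len(cpp_preds):
        raise ValueError("prediction vectors differ in length")

    max_abs_diff = 0.0
    score_mismatches = 0
    label_mismatches = 0
    for a, b in zip(py_preds, cpp_preds):
        diff = abs(a - b)
        max_abs_diff = max(max_abs_diff, diff)
        if diff > tol:
            score_mismatches += 1
        # Same rule as the C++ loop: strictly below the threshold is a zero
        if (a < threshold) != (b < threshold):
            label_mismatches += 1
    return {
        "max_abs_diff": max_abs_diff,
        "score_mismatches": score_mismatches,
        "label_mismatches": label_mismatches,
    }
```

If almost every row differs, the two programs are probably not evaluating the same model on the same data in the same way. If only a handful differ by tiny amounts, numeric precision is a more plausible suspect.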

I have tried the following:

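Made sure the model is trained in Python on a dense matrix, since XGBoosterPredict takes a dense matrix (an assumption I drew from a similar question on Stack Overflow).
Used xgb.train rather than XGBClassifier.fit. train takes a DMatrix; fit does not.
Converted the training data to a NumPy matrix with np.asmatrix(X).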

Does anyone have any idea what I am doing wrong? Thanks.

XGBoost predictions differ between C++ and Python for the same model

0 answers: