Is there a way to pass custom test data to an XGBoost prediction model?

Asked: 2019-12-12 03:44:50

Tags: machine-learning scikit-learn xgboost

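I have already fit my models on the training and test data, and now I am trying to feed the XGBoost model custom data. This data has the same format as my training and test data, but it is a one-dimensional array (a pandas Series) rather than a two-dimensional one. I keep getting the error TypeError: can not initialize DMatrix from Series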

Here is my code:

import numpy as np
import pandas as pd

#create test data series
newcol = ["away", "home"]
#enter names of teams to get
testrow = pd.Series(["Seoul Dynasty", "Dallas Fuel"], index=newcol)

#get all stats for each team
awayrow = get_team_stats(testrow[0])
homerow = get_team_stats(testrow[1])

#convert columns to proper team placement
awayrow = awayrow.rename(lambda x: "Away " + x)
homerow = homerow.rename(lambda x: "Home " + x)

#turn name into id
for name, team in ID_TO_NAME.items():
    if team == testrow[0]:
        testrow[0] = name
    if team == testrow[1]:
        testrow[1] = name
testrow = pd.concat([testrow, awayrow, homerow])
testrow = testrow[~testrow.index.str.contains("Rank")]

#predictions
rfcprediction = rfc.predict([testrow])
lrprediction = lr.predict([testrow])
knnprediction = knn.predict([testrow])
svprediction = sv.predict([testrow])

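#reshape the series into a single 2D row for XGBoost
#(kept in its own variable so convert_prediction can still read testrow)
xgrow = np.array(testrow).reshape((1, -1))
xgprediction = xgboost.predict(xgrow)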

#turn id back into name
def convert_prediction(prediction):
    if prediction[0] == 1:
        #Home Won
        return ID_TO_NAME.get(testrow[1])
    if prediction[0] == 0:
        #Away Won
        return ID_TO_NAME.get(testrow[0])


print("Random Forest Prediction: ", convert_prediction(rfcprediction))
print(" ")
print("Logistic Regression Prediction: ", convert_prediction(lrprediction))
print(" ")
print("K Nearest Neighbors Prediction: ", convert_prediction(knnprediction))
print(" ")
print("SVC Prediction: ", convert_prediction(svprediction))
print(" ")

I was told to reshape the array (it is a Series), but that gives me a feature_names mismatch error. How can I make an XGBoost prediction for a single-row Series?
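One likely reading of the two errors: a Series is one-dimensional, and a bare NumPy array carries no column names, so a model that was fitted on a named DataFrame may report a feature_names mismatch. The sketch below shows one way to turn the Series into a one-row DataFrame whose columns match the training columns in name and order. The helper series_to_model_row, the column list train_columns, and the sample values are hypothetical stand-ins, not taken from the question; in practice train_columns would be the columns of the DataFrame the model was fitted on.

```python
import pandas as pd


def series_to_model_row(row: pd.Series, train_columns) -> pd.DataFrame:
    """Turn one Series (index = feature names) into a 1-row DataFrame
    with the same column names and order as the training data."""
    frame = row.to_frame().T  # shape (1, n_features), columns = row.index
    missing = [c for c in train_columns if c not in frame.columns]
    if missing:
        raise KeyError(f"features missing from the row: {missing}")
    frame = frame[list(train_columns)]  # training order; extra columns are dropped
    return frame.apply(pd.to_numeric)   # object dtype -> numeric dtypes


# Hypothetical stand-ins for the real training columns and the assembled row.
train_columns = ["away", "home", "Away Win Rate", "Home Win Rate"]
testrow = pd.Series(
    [7, 3, 0.61, 0.55],
    index=["home", "away", "Home Win Rate", "Away Win Rate"],
    dtype=object,
)

row_df = series_to_model_row(testrow, train_columns)
print(row_df.shape)          # (1, 4)
print(list(row_df.columns))  # same order as train_columns

# With the fitted model from the question, the call would then be:
# xgprediction = xgboost.predict(row_df)
```

If the model was fitted on a plain NumPy array instead of a DataFrame, the column names do not matter and only the column order and the (1, n_features) shape need to match.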

0 answers:

No answers