How do I fix the XGBoost error "feature_names mismatch"?

Time: 2020-03-06 18:52:42

Tags: python scikit-learn xgboost

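I am using XGBoost to make predictions on the iris dataset. The model trains successfully, but when I try to make a new prediction I get the following error: ValueError: feature_names mismatch.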

import numpy as np
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()
iris_df = pd.DataFrame(data= np.c_[iris['data'], iris['target']],
                 columns= iris['feature_names'] + ['target'])

iris_df.rename(columns={'sepal length (cm)': 'sepal_length',
                        'sepal width (cm)': 'sepal_width',
                        'petal length (cm)': 'petal_length',
                        'petal width (cm)': 'petal_width'}, inplace=True)

data = iris_df[['sepal_length', 'sepal_width', 'petal_length', 'petal_width']]
target = iris_df[['target']]

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(data, target, test_size=0.2, random_state=0)

import xgboost as xgb

train = xgb.DMatrix(X_train, label=y_train)
test = xgb.DMatrix(X_test, label=y_test)

param = {
    'max_depth':4,
    'eta':0.3,
    'objective': 'multi:softmax',
    'num_class': 3}
epochs = 10

model = xgb.train(param, train, epochs)

predictions = model.predict(test)
print(predictions)

from sklearn.metrics import accuracy_score

accuracy_score(y_test, predictions)

All of the code above works, but making a new prediction raises the error:

testArray=np.array([[5.1,3.5,1.4,0.2]])

test_individual=xgb.DMatrix(testArray)

print(model.predict(test_individual))

How can I make the new prediction work?

3 answers:

Answer 0 (score: 0)

Pass xgb.DMatrix the data array 'testArray' together with the list of column labels as its feature names:

test_individual = xgb.DMatrix(testArray, feature_names=list(X_train.columns))

Answer 1 (score: 0)

Pass a one-dimensional array to xgb.DMatrix:

testArray = np.array([5.1,3.5,1.4,0.2])

Answer 2 (score: 0)

Solved the problem. I had to convert the DataFrame objects to arrays:


train = iris_df[['sepal_length','sepal_width','petal_length','petal_width']].values
test = iris_df[['target']].values