I am using XGBoost to make predictions on the Iris dataset. The model trains fine, but when I try to predict on a new sample I get the following error: ValueError: feature_names mismatch.
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
iris = load_iris()
iris_df = pd.DataFrame(data=np.c_[iris['data'], iris['target']],
                       columns=iris['feature_names'] + ['target'])
iris_df.rename(columns={'sepal length (cm)':'sepal_length'}, inplace=True)
iris_df.rename(columns={'sepal width (cm)':'sepal_width'},inplace=True)
iris_df.rename(columns={'petal length (cm)':'petal_length'},inplace=True)
iris_df.rename(columns={'petal width (cm)':'petal_width'}, inplace=True)
data = iris_df[['sepal_length', 'sepal_width', 'petal_length', 'petal_width']]
target = iris_df[['target']]
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(data, target, test_size=0.2, random_state=0)
import xgboost as xgb
train = xgb.DMatrix(X_train, label=y_train)
test = xgb.DMatrix(X_test, label=y_test)
param = {
    'max_depth': 4,
    'eta': 0.3,
    'objective': 'multi:softmax',
    'num_class': 3}
epochs = 10
model = xgb.train(param, train, epochs)
predictions = model.predict(test)
print(predictions)
from sklearn.metrics import accuracy_score
accuracy_score(y_test, predictions)
All of the code above works, but making a new prediction raises the error:
testArray=np.array([[5.1,3.5,1.4,0.2]])
test_individual=xgb.DMatrix(testArray)
print(model.predict(test_individual))
How can I make the new prediction work?
Answer 0 (score: 0)
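Pass xgb.DMatrix the data array 'testArray' together with the list of training column names, so that the new matrix carries the same feature names as the training data:
test_individual = xgb.DMatrix(testArray, feature_names=list(X_train.columns))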
Answer 1 (score: 0)
Pass a one-dimensional array to xgb.DMatrix:
testArray = np.array([5.1,3.5,1.4,0.2])
Answer 2 (score: 0)
Solved the problem. The DataFrame objects had to be converted to arrays before splitting and training:
data = iris_df[['sepal_length', 'sepal_width', 'petal_length', 'petal_width']].values
target = iris_df[['target']].values