Auto-sklearn: How to load a pickle file and run predict()

Time: 2020-05-09 20:21:20

Tags: python automl

I have fitted a classification model with auto-sklearn and managed to save it to a pickle file.

x = automl.show_models()
results = {"ensemble": x}
pickle.dump(results, open('file.pickle','wb'))

I also managed to load it back.

automl = pickle.load(open('file.pickle','rb'))

However, I cannot use the reloaded model to make predictions on new data. When I run:

y_hat = automl.predict(X_test)

I get the following error:

AttributeError: 'str' object has no attribute 'predict'

2 Answers:

Answer 0: (score: 1)

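I think you should dump the automl object itself rather than the output of automl.show_models(), because the automl object is the only thing here that has a predict method.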
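To make the point concrete, here is a minimal sketch that reproduces the situation without auto-sklearn installed. StandInAutoML is a hypothetical stand-in class, not part of auto-sklearn, and the string it returns from show_models() is placeholder text. It only assumes, based on the error message in the question, that show_models() returns a string describing the ensemble.

```python
import pickle


class StandInAutoML:
    """Hypothetical stand-in for a fitted auto-sklearn estimator."""

    def show_models(self):
        # Returns a human-readable description (placeholder text), not a model.
        return "[(1.0, SimpleClassificationPipeline(...))]"

    def predict(self, X):
        # Dummy prediction: one label per input row.
        return [0 for _ in X]


automl = StandInAutoML()

# Round-trip the *description* through pickle, as the question does.
restored = pickle.loads(pickle.dumps(automl.show_models()))

restored_type = type(restored).__name__
restored_can_predict = hasattr(restored, "predict")
estimator_can_predict = hasattr(automl, "predict")

print(restored_type, restored_can_predict, estimator_can_predict)
# str False True
```

Pickle gives back exactly the kind of object that was dumped, so dumping a description yields a description, and calling predict on it raises the AttributeError from the question.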

Answer 1: (score: 1)

Incorrect:

x = automl.show_models()
results = {"ensemble": x} # <---
pickle.dump(results, open('file.pickle','wb'))

Correct:

x = automl.show_models()
#results = {"ensemble": x}
results = automl  # the classifier/regressor itself
pickle.dump(results, open('file.pickle','wb'))

Example code using the iris dataset:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from autosklearn.classification import AutoSklearnClassifier
import pickle


# dataset:
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2)

# train model:
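classifier = AutoSklearnClassifier(
    time_left_for_this_task=30,  # total search budget in seconds
    per_run_time_limit=10,  # seconds per single model fit; kept below the total budget
    memory_limit=1024*12)  # in MB; depends on your computer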
classifier.fit(X_train, y_train)

# save model
with open('iris-classifier.pkl', 'wb') as f:
    pickle.dump(classifier, f)

# load model
with open('iris-classifier.pkl', 'rb') as f:
    loaded_classifier = pickle.load(f)

# predict
y_true = y_test
y_pred = loaded_classifier.predict(X_test)
print('iris classifier: accuracy:', accuracy_score(y_true, y_pred))
# example output (varies with the random train/test split):
# iris classifier: accuracy: 0.9333333333333333
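As an optional safeguard, the loading step can check that the pickle really holds something with a predict method before it is used. The sketch below uses only the standard library. load_estimator is a hypothetical helper name, not an auto-sklearn function, and the dict written to the temporary file mimics the one saved in the question.

```python
import os
import pickle
import tempfile


def load_estimator(path):
    """Load a pickle and insist that it holds an object with predict()."""
    with open(path, "rb") as f:
        obj = pickle.load(f)
    if not callable(getattr(obj, "predict", None)):
        raise TypeError(
            f"{path} contains a {type(obj).__name__}, not a fitted estimator"
        )
    return obj


# Demonstration: pickle a dict like the one in the question, then try to load it.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file.pickle")
    with open(path, "wb") as f:
        pickle.dump({"ensemble": "model description"}, f)
    try:
        load_estimator(path)
        message = "loaded"
    except TypeError as exc:
        message = str(exc)

print(message)
```

With a pickle that holds the fitted classifier, such as iris-classifier.pkl above, the same helper would simply return the object, so the mistake shows up at load time rather than at the first predict call.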