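I have built a scikit-learn pipeline that uses an LSTM Keras model (wrapped with keras.wrappers.scikit_learn.KerasClassifier) as its final step. Once the pipeline has finished training, I save the whole pipeline to disk (see below). I am having trouble loading the pipeline back into memory and making predictions with it. scikit-learn pipelines and Keras models do not seem to work well together at the moment, which makes this tricky. Does anyone have experience with this?
tensorflow: 2.3.1, keras: 2.4.3, scikit-learn: 0.23.2
Code: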
import pandas as pd
from model_lstm.config import config
import joblib
import keras
from keras.wrappers.scikit_learn import KerasClassifier
from model_lstm.utils import data_management as dm


def save_fitted_pipeline(pipeline):
    model_path = config.TRAINED_MODEL_DIR / config.TRAINED_MODEL_FILE
    pipeline_path = config.TRAINED_MODEL_DIR / config.TRAINED_PIPELINE_FILE
    # Save the Keras model on its own, then drop it from the wrapper
    # before dumping the rest of the pipeline with joblib.
    pipeline.named_steps["lstm_model"].model.save(model_path)
    pipeline.named_steps["lstm_model"].model = None
    joblib.dump(pipeline, pipeline_path)


def load_fitted_pipeline():
    model_path = config.TRAINED_MODEL_DIR / config.TRAINED_MODEL_FILE
    pipeline_path = config.TRAINED_MODEL_DIR / config.TRAINED_PIPELINE_FILE
    pipeline = joblib.load(pipeline_path)
    # Rebuild the wrapper and reattach the separately saved Keras model.
    model_func = lambda: keras.models.load_model(model_path)
    wrapped_model = KerasClassifier(build_fn=model_func)
    pipeline.named_steps["lstm_model"] = wrapped_model
    pipeline.named_steps["lstm_model"].model = keras.models.load_model(model_path)
    return pipeline


def predict():
    lstm_pipeline = load_fitted_pipeline()
    data_path = config.DATA_DIR / config.TRAINING_DATA_FILE
    X_train, y_train = dm.load_data(data_path)
    pred = lstm_pipeline.predict(X_train)
Current error:
../model_lstm/predict.py:8: in predict
    pred = lstm_pipeline.predict(X_train)
../../../../anaconda/envs/sa_model_lstm/lib/python3.7/site-packages/sklearn/utils/metaestimators.py:119: in <lambda>
    out = lambda *args, **kwargs: self.fn(obj, *args, **kwargs)
../../../../anaconda/envs/sa_model_lstm/lib/python3.7/site-packages/sklearn/pipeline.py:408: in predict
    return self.steps[-1][-1].predict(Xt, **predict_params)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <tensorflow.python.keras.wrappers.scikit_learn.KerasClassifier object at 0x1a4256f690>
x = array([[ 0, 0, 0, ..., 125, 309, 310],
       [ 0, 0, 0, ..., 19, 3, 312],
       [ 0, 0...076],
       [ 0, 0, 0, ..., 2, 1077, 13],
       [ 0, 0, 0, ..., 1080, 160, 1081]], dtype=int32)
kwargs = {'batch_size': 128, 'verbose': 1}

    def predict(self, x, **kwargs):
        """Returns the class predictions for the given test data.

        Arguments:
            x: array-like, shape `(n_samples, n_features)`
                Test samples where `n_samples` is the number of samples
                and `n_features` is the number of features.
            **kwargs: dictionary arguments
                Legal arguments are the arguments
                of `Sequential.predict_classes`.

        Returns:
            preds: array-like, shape `(n_samples,)`
                Class predictions.
        """
        kwargs = self.filter_sk_params(Sequential.predict_classes, kwargs)
>       classes = self.model.predict_classes(x, **kwargs)
E       AttributeError: 'Functional' object has no attribute 'predict_classes'

../../../../anaconda/envs/sa_model_lstm/lib/python3.7/site-packages/tensorflow/python/keras/wrappers/scikit_learn.py:241: AttributeError
Answer 0 (score: 0)
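This was my question. For anyone interested, I managed to solve it myself. It turns out the problem had nothing to do with the pipeline being written to disk and read back. The keras.wrappers.scikit_learn.KerasClassifier wrapper only works correctly when your Keras model is an instance of Sequential rather than Model. As the traceback shows, the wrapper's predict calls self.model.predict_classes, which my functional model did not have. I converted my model to Sequential and everything worked. In fact, the save and load logic became simpler than what is shown in the code above.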
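To illustrate why the model type matters, here is a minimal sketch in plain Python. It does not use Keras at all: FunctionalStandIn, SequentialStandIn, WrapperStandIn and try_predict are hypothetical names made up for this example, and the fixed probabilities are invented. The only part taken from the traceback is that the wrapper's predict calls self.model.predict_classes.

```python
class FunctionalStandIn:
    """Hypothetical stand-in for a functional-API model: it only has predict()."""

    def predict(self, x):
        # Pretend every sample gets the same two-class probabilities.
        return [[0.2, 0.8] for _ in x]


class SequentialStandIn(FunctionalStandIn):
    """Hypothetical stand-in for Sequential, which also offers predict_classes()."""

    def predict_classes(self, x):
        # Index of the highest probability in each row.
        return [row.index(max(row)) for row in self.predict(x)]


class WrapperStandIn:
    """Mimics the single call in KerasClassifier.predict shown in the traceback."""

    def __init__(self, model):
        self.model = model

    def predict(self, x):
        return self.model.predict_classes(x)


def try_predict(model, x):
    """Return the predicted classes, or the AttributeError message on failure."""
    try:
        return WrapperStandIn(model).predict(x)
    except AttributeError as exc:
        return str(exc)


samples = [[0, 0, 125], [0, 0, 19]]  # made-up inputs, two samples
sequential_result = try_predict(SequentialStandIn(), samples)
functional_result = try_predict(FunctionalStandIn(), samples)
print(sequential_result)   # [1, 1]
print(functional_result)   # the AttributeError message about predict_classes
```

The wrapper works with the Sequential stand-in and fails with the functional one in the same way as the error above, which is why the save and load steps were never the real cause.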