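I have wrapped a Keras (TensorFlow) model in an sklearn Pipeline that also does some pre- and post-processing. I want to serialize this model and capture its dependencies via MLflow.
I tried mlflow.keras.save_model(), which does not seem appropriate: the pipeline is not a "pure" Keras model and has no save() attribute.
I also tried mlflow.sklearn.save_model() and mlflow.pyfunc.save_model(); both fail with the same error:
NotImplementedError: numpy() is only available when eager execution is enabled.
(Is this error caused by a conflict between Python and TensorFlow?)
Is it possible, and is it common practice, to serialize such "mixed" models with MLflow?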
# In[1]:
import numpy as np
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam
# ### Save Keras Model
# In[2]:
iris_data = load_iris()
x = iris_data.data
y_ = iris_data.target.reshape(-1, 1)
# One Hot encode the class labels
encoder = OneHotEncoder(sparse=False)
y = encoder.fit_transform(y_)
# Split the data for training and testing
train_x, test_x, train_y, test_y = train_test_split(x, y, test_size=0.20)
# Build the model
model = Sequential()
model.add(Dense(10, input_shape=(4,), activation='relu', name='fc1'))
model.add(Dense(10, activation='relu', name='fc2'))
model.add(Dense(3, activation='softmax', name='output'))
optimizer = Adam(lr=0.001)
model.compile(optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
# Train the model
model.fit(train_x, train_y, verbose=2, batch_size=5, epochs=20)
# In[3]:
import mlflow.keras
mlflow.keras.save_model(model, "modelstorage/model40")
# ### Save Minimal SKlearn-Pipeline (with Keras)
# In[4]:
from category_encoders.target_encoder import TargetEncoder
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from keras.wrappers.scikit_learn import KerasClassifier
# In[5]:
def define_model():
    """
    Create fully connected network with given parameters.
    """
    keras_model = Sequential()
    keras_model.add(Dense(10, input_shape=(4,), activation='relu', name='fc1'))
    keras_model.add(Dense(10, activation='relu', name='fc2'))
    keras_model.add(Dense(3, activation='softmax', name='output'))
    optimizer = Adam(lr=0.001)
    keras_model.compile(optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
    # Return the network built here, not the global `model` trained earlier
    return keras_model
# In[6]:
# target_encoder = TargetEncoder()
scaler = StandardScaler()
keras_model = KerasClassifier(define_model, batch_size=5, epochs=20)
# In[7]:
pipeline = Pipeline([
# ('encoding', target_encoder),
('scaling', scaler),
('modeling', keras_model)
])
# In[8]:
pipeline.fit(train_x, train_y)
# In[9]:
mlflow.keras.save_model(pipeline, "modelstorage/model42") #not working
# In[10]:
import mlflow.sklearn
mlflow.sklearn.save_model(pipeline, "modelstorage/model43")
Output from modelstorage/model43/conda.yaml:
======================
channels:
- defaults
dependencies:
- python=3.6.7
- scikit-learn=0.21.2
- pip:
  - mlflow
  - cloudpickle==1.2.1
name: mlflow-env
======================
The generated environment doesn't seem to capture TensorFlow.
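One way to confirm this without reading the file by eye is sketched below. It assumes the dependency list has been copied by hand from the conda.yaml output above; `package_names` and `missing_packages` are hypothetical helpers written for this illustration and are not part of MLflow.

```python
import re

# Dependencies copied by hand from the conda.yaml shown above.
dependencies = [
    "python=3.6.7",
    "scikit-learn=0.21.2",
    {"pip": ["mlflow", "cloudpickle==1.2.1"]},
]


def package_names(deps):
    """Collect bare package names from conda-style and nested pip requirement strings."""
    names = set()
    for dep in deps:
        if isinstance(dep, dict):
            # Nested section such as {"pip": [...]}
            for sub in dep.values():
                names |= package_names(sub)
        else:
            # Strip version specifiers such as "=0.21.2" or "==1.2.1".
            names.add(re.split(r"[=<>!~ ]", dep, maxsplit=1)[0].lower())
    return names


def missing_packages(deps, required):
    """Return the required packages that the environment does not list (hypothetical helper)."""
    return sorted(set(required) - package_names(deps))


missing = missing_packages(dependencies, ["scikit-learn", "tensorflow", "keras"])
print(missing)  # ['keras', 'tensorflow']
```

For the environment above, both keras and tensorflow are reported as missing, which matches the observation that only the sklearn flavor's defaults were recorded.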
Answer 0 (score: 1)
You can add extra dependencies when saving the model. For example, if the pipeline contains a Keras step, you can add keras and tensorflow:
conda_env = mlflow.sklearn.get_default_conda_env()
conda_env["dependencies"] = ['keras==2.2.4', 'tensorflow==1.14.0'] + conda_env["dependencies"]
mlflow.sklearn.log_model(pipeline, "modelstorage/model43", conda_env=conda_env)
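The list concatenation above prepends the new packages to the conda-level dependencies. If you would rather have them installed through pip, or want to avoid listing a package twice, a small helper can do the merge. This is a minimal sketch, not MLflow functionality: `add_dependencies` is a hypothetical name, and `default_env` is a hand-written stand-in that mirrors the conda.yaml from the question rather than the real return value of mlflow.sklearn.get_default_conda_env(). Whether keras and tensorflow belong under conda or pip is a judgment call; the example puts them under pip.

```python
import copy

# Hand-written stand-in for the default environment dict; it mirrors the
# conda.yaml from the question (assumption, not fetched from MLflow).
default_env = {
    "name": "mlflow-env",
    "channels": ["defaults"],
    "dependencies": [
        "python=3.6.7",
        "scikit-learn=0.21.2",
        {"pip": ["mlflow", "cloudpickle==1.2.1"]},
    ],
}


def add_dependencies(conda_env, conda_deps=(), pip_deps=()):
    """Return a copy of conda_env with extra conda and pip requirements (hypothetical helper)."""
    env = copy.deepcopy(conda_env)  # leave the caller's dict untouched
    deps = env.setdefault("dependencies", [])
    # Locate the nested {"pip": [...]} entry, creating it if it is missing.
    pip_entry = next((d for d in deps if isinstance(d, dict) and "pip" in d), None)
    if pip_entry is None:
        pip_entry = {"pip": []}
        deps.append(pip_entry)
    # Insert conda packages just before the pip entry, skipping exact duplicates.
    insert_at = deps.index(pip_entry)
    for dep in conda_deps:
        if dep not in deps:
            deps.insert(insert_at, dep)
            insert_at += 1
    # Append pip packages to the nested pip list, skipping exact duplicates.
    for dep in pip_deps:
        if dep not in pip_entry["pip"]:
            pip_entry["pip"].append(dep)
    return env


conda_env = add_dependencies(default_env, pip_deps=["keras==2.2.4", "tensorflow==1.14.0"])
```

The resulting `conda_env` dict has the same shape as the one built in the answer, so it can be passed as the `conda_env` argument in the same way.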