Saving and loading a Keras model

Time: 2019-04-01 12:49:11

Tags: python keras deep-learning

I am working on a Keras model that uses the Universal Sentence Encoder to encode the provided sentences. However, when I save the model for later use, it throws the following error: NameError: name 'embed' is not defined

The sentences are converted into embeddings with the UniversalEmbedding(x) function. The code for the whole model comes from here

!wget https://raw.githubusercontent.com/Tony607/Keras-Text-Transfer-Learning/master/train_5500.txt
!wget https://raw.githubusercontent.com/Tony607/Keras-Text-Transfer-Learning/master/test_data.txt

import tensorflow as tf
import tensorflow_hub as hub
import matplotlib.pyplot as plt
import numpy as np
import os
import pandas as pd
import re
import seaborn as sns
import keras.layers as layers
from keras.models import Model
from keras import backend as K
np.random.seed(10)

def get_dataframe(filename):
    lines = open(filename, 'r').read().splitlines()
    data = []
    for i in range(0, len(lines)):
        label = lines[i].split(' ')[0]
        label = label.split(":")[0]
        text = ' '.join(lines[i].split(' ')[1:])
        text = re.sub('[^A-Za-z0-9 ,\?\'\"-._\+\!/\`@=;:]+', '', text)
        data.append([label, text])

    df = pd.DataFrame(data, columns=['label', 'text'])
    df.label = df.label.astype('category')
    return df

df_train = get_dataframe('train_5500.txt')

category_counts = len(df_train.label.cat.categories)
module_url = "https://tfhub.dev/google/universal-sentence-encoder-large/3" 
embed = hub.Module(module_url)
embed_size = embed.get_output_info_dict()['default'].get_shape()[1].value

def UniversalEmbedding(x):
    return embed(tf.squeeze(tf.cast(x, tf.string)), signature="default", as_dict=True)["default"]

input_text = layers.Input(shape=(1,), dtype='string')
embedding = layers.Lambda(UniversalEmbedding, output_shape=(embed_size,))(input_text)
dense = layers.Dense(256, activation='relu')(embedding)
pred = layers.Dense(category_counts, activation='softmax')(dense)
model = Model(inputs=[input_text], outputs=pred)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

train_text = df_train['text'].tolist()
train_text = np.array(train_text, dtype=object)[:, np.newaxis]

train_label = np.asarray(pd.get_dummies(df_train.label), dtype = np.int8)

df_test = get_dataframe('test_data.txt')
test_text = df_test['text'].tolist()
test_text = np.array(test_text, dtype=object)[:, np.newaxis]
test_label = np.asarray(pd.get_dummies(df_test.label), dtype = np.int8)


with tf.Session() as session:
  K.set_session(session)
  session.run(tf.global_variables_initializer())
  session.run(tf.tables_initializer())
  history = model.fit(train_text, 
            train_label,
            validation_data=(test_text, test_label),
            epochs=2,
            batch_size=32)
  model.save_weights('./model.h5')
  model.save('mod.h5')

When I try to load the model:

from keras.models import load_model

load_model('mod.h5') 

1 Answer:

Answer 0 (score: 0)

When you try to load the model with Keras' load_model, it throws this error because embed is not a Keras built-in. To fix it, you may have to define embed again in your code before loading the model with load_model.

Before trying to load the model, write embed = hub.Module(module_url), together with the required libraries and the module URL, as shown in the linked post (https://www.dlology.com/blog/keras-meets-universal-sentence-encoder-transfer-learning-for-text-data/).
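
For illustration, here is a minimal sketch of that idea, assuming the same TF1 / tensorflow_hub setup as in the question. Passing the function through load_model's custom_objects argument is one way (my addition, not stated in the answer above) to make the Lambda layer's function resolvable when the model is deserialized:

import tensorflow as tf
import tensorflow_hub as hub
from keras import backend as K
from keras.models import load_model

# Re-create the objects the saved Lambda layer refers to.
module_url = "https://tfhub.dev/google/universal-sentence-encoder-large/3"
embed = hub.Module(module_url)

def UniversalEmbedding(x):
    return embed(tf.squeeze(tf.cast(x, tf.string)),
                 signature="default", as_dict=True)["default"]

with tf.Session() as session:
    K.set_session(session)
    # custom_objects lets the loader find UniversalEmbedding by name.
    model = load_model('mod.h5',
                       custom_objects={'UniversalEmbedding': UniversalEmbedding})
    # The hub module's lookup tables still have to be initialized
    # before calling predict.
    session.run(tf.tables_initializer())

If load_model still cannot rebuild the Lambda layer, the linked post's approach also works with the weights file saved in the question: rebuild the model architecture in code and call model.load_weights('./model.h5') inside the same kind of session block.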