I have custom embeddings (about 14 GB) and a vocabulary (about 3 GB). I built a TF graph that extracts the embeddings. How can I make sure both the embeddings and the vocabulary are loaded into memory when the model is served, so they are available for all subsequent API requests?
import tensorflow as tf

# Input sentences and vocabulary lookup table (vocabList is built elsewhere).
sentences = tf.placeholder(shape=[None], dtype=tf.string, name="sentences")
mapping_strings = tf.constant(vocabList)
table = tf.contrib.lookup.index_table_from_tensor(mapping=mapping_strings,
                                                  num_oov_buckets=1,
                                                  default_value=0)

# Tokenise and map words to vocabulary indices.
words = tf.string_split(sentences, " ")
emd = table.lookup(words)
emd = tf.cast(emd, dtype=tf.int32)
dense_word_indices = tf.sparse.to_dense(emd)
dense_word_indices = tf.cast(dense_word_indices, dtype=tf.int32)
hashed_word_indices = tf.map_fn(add_n_grams,
                                dense_word_indices,
                                back_prop=False,
                                dtype=tf.int32)

# Embedding matrix fed in through a placeholder and assigned to a variable.
embedding_weights = tf.Variable(tf.constant(0.0, shape=[5002889, 700]),
                                trainable=False, name="embedding_weights")
embedding_placeholder = tf.placeholder(tf.float32, [5002889, 700])
embedding_init = embedding_weights.assign(embedding_placeholder)
# Note: the lookup is wired to the assign op, not to the variable itself.
embeddings = tf.nn.embedding_lookup(embedding_init, hashed_word_indices)
with tf.Session() as sess:
    sess.run([tf.global_variables_initializer(), tf.tables_initializer()])

    # Sanity check on a few test sentences.
    embedding = sess.run(embeddings,
                         feed_dict={
                             sentences: test_sentences,
                         })

    # Export the model for serving.
    inputs = {"sentences": sentences}
    outputs = {"sentence_embeddings": embeddings}
    export_path = "./"
    tf.saved_model.simple_save(sess,
                               export_path,
                               inputs=inputs,
                               outputs=outputs,
                               legacy_init_op=tf.tables_initializer())
I can load the model with Docker, but I get the following error:
{ "error": "You must feed a value for placeholder tensor \'Placeholder\' with dtype float and shape [5002889,700]\n\t [[{{node Placeholder}} = Placeholder[_output_shapes=[[5002889,700]], dtype=DT_FLOAT, shape=[5002889,700], _device=\"/job:localhost/replica:0/task:0/device:CPU:0\"]()]]" }
This happens because I never feed the embeddings through the placeholder. Is it possible to load them when the container starts, or to include them in the model itself?
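One direction that seems possible (a sketch only, not something I have verified): keep the lookup on the variable itself, run the assign once with the real weights before exporting, and let simple_save write the variable's current value into the SavedModel's variables/ shard, so TensorFlow Serving restores it into memory at startup without needing the placeholder. Here embedding_matrix stands for the in-memory numpy array holding the 14 GB of pretrained weights, and hashed_word_indices (and sentences) come from the same pipeline as above:

import tensorflow as tf

# ... the sentences placeholder, vocabulary table, string_split and
# add_n_grams pipeline exactly as above, ending in hashed_word_indices ...

embedding_weights = tf.Variable(tf.constant(0.0, shape=[5002889, 700]),
                                trainable=False, name="embedding_weights")
embedding_placeholder = tf.placeholder(tf.float32, [5002889, 700])
embedding_init = embedding_weights.assign(embedding_placeholder)

# Look up against the variable, NOT against embedding_init, so the serving
# graph no longer depends on embedding_placeholder.
embeddings = tf.nn.embedding_lookup(embedding_weights, hashed_word_indices)

with tf.Session() as sess:
    sess.run([tf.global_variables_initializer(), tf.tables_initializer()])

    # Load the real weights exactly once, before exporting.
    sess.run(embedding_init,
             feed_dict={embedding_placeholder: embedding_matrix})

    # The variable's value (the full matrix) is saved into variables/ and
    # restored by TensorFlow Serving when the container starts.
    tf.saved_model.simple_save(sess,
                               "./",
                               inputs={"sentences": sentences},
                               outputs={"sentence_embeddings": embeddings},
                               legacy_init_op=tf.tables_initializer())

The 3 GB vocabulary could presumably be handled in a similar way by switching from index_table_from_tensor to tf.contrib.lookup.index_table_from_file, which registers the vocabulary file as a SavedModel asset instead of baking a 3 GB constant into the GraphDef.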