Question

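I am trying to use the embeddings modules from TensorFlow Hub as a servable. I am new to TensorFlow. Currently, I use the Universal Sentence Encoder embeddings as a lookup to convert sentences to embeddings, and then use those embeddings to find the similarity to another sentence.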

My current code for converting sentences to embeddings is:

with tf.Session() as session:
  session.run([tf.global_variables_initializer(), tf.tables_initializer()])
  sen_embeddings = session.run(self.embed(prepared_text))

prepared_text is a list of sentences. How can I take this model and make it servable?

Answer 1

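For now, you probably need to do this manually. Here is my solution, similar to the previous answer but more general: it shows how to use any other module without guessing the input parameters, and it is extended with verification and usage: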

import tensorflow as tf
import tensorflow_hub as hub
from tensorflow.saved_model import simple_save

export_dir = "/tmp/tfserving/universal_encoder/00000001"
with tf.Session(graph=tf.Graph()) as sess:
    module = hub.Module("https://tfhub.dev/google/universal-sentence-encoder/2") 
    input_params = module.get_input_info_dict()
    # Inspect which tensors the model accepts; 'text' is the input tensor name.

    text_input = tf.placeholder(name='text', dtype=input_params['text'].dtype,
        shape=input_params['text'].get_shape())

    embeddings = module(text_input)

    # Run the initializers after the module is applied, so that any variables and
    # lookup tables it adds to the graph are covered.
    sess.run([tf.global_variables_initializer(), tf.tables_initializer()])

    simple_save(sess,
        export_dir,
        inputs={'text': text_input},
        outputs={'embeddings': embeddings},
        legacy_init_op=tf.tables_initializer())

Thanks to module.get_input_info_dict(), you know which tensor names need to be passed to the model; you use that name as the key in the inputs={} argument of simple_save.

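Remember that to serve the model, it must be in a directory path that ends with a version number, which is why '00000001' is the last path component, the one in which saved_model.pb resides.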
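The layout requirement can be checked before starting the server. The following is a minimal sketch, assuming the export path used in the code above; find_version_dirs is a hypothetical helper written for illustration, not part of TensorFlow:

```python
import os

def find_version_dirs(model_base_path):
    """Return the numeric versions found under model_base_path, sorted ascending.

    Hypothetical helper: a subdirectory counts as a version only if its name
    is an integer and it contains a saved_model.pb file.
    """
    versions = []
    if not os.path.isdir(model_base_path):
        return versions
    for name in os.listdir(model_base_path):
        path = os.path.join(model_base_path, name)
        # TF Serving expects each version in a subdirectory named with an integer.
        if name.isdigit() and os.path.isfile(os.path.join(path, "saved_model.pb")):
            versions.append(int(name))
    return sorted(versions)

if __name__ == "__main__":
    # Path assumed from the export example above.
    print(find_version_dirs("/tmp/tfserving/universal_encoder"))
```

An empty list suggests the export landed in the wrong place or the directory name is not a plain integer.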

After exporting the module, the quickest way to check whether the model was exported correctly for serving is to use saved_model_cli:

saved_model_cli run --dir /tmp/tfserving/universal_encoder/00000001 --tag_set serve --signature_def serving_default --input_exprs 'text=["what this is"]'

To serve the model from Docker:

docker pull tensorflow/serving  
docker run -p 8501:8501 -v /tmp/tfserving/universal_encoder:/models/universal_encoder -e MODEL_NAME=universal_encoder -t tensorflow/serving
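Once the container is up, the model can be queried over the REST port mapped above. This is a rough sketch using only the standard library; build_predict_request and embed are hypothetical helper names, and the shape of the response (a "predictions" list with one embedding per sentence) is an assumption based on the single 'embeddings' output exported earlier:

```python
import json
import urllib.request

def build_predict_request(sentences, host="localhost", port=8501,
                          model_name="universal_encoder"):
    """Build the URL and JSON body for TF Serving's REST predict endpoint."""
    url = "http://{}:{}/v1/models/{}:predict".format(host, port, model_name)
    # Row format: one entry per sentence. This relies on the signature
    # having a single input ('text' in the export above).
    body = json.dumps({"instances": list(sentences)}).encode("utf-8")
    return url, body

def embed(sentences, **kwargs):
    """Send the sentences to a running server and return its 'predictions' list."""
    url, body = build_predict_request(sentences, **kwargs)
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))["predictions"]
```

With the server running, embed(["what this is"]) should return one vector per input sentence; the call fails with a connection error if the container is not up.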

Answer 2

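Currently, hub modules cannot be consumed directly by TensorFlow Serving. You will have to load the module into an empty graph and then export it using the SavedModelBuilder. For example: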

import tensorflow as tf
import tensorflow_hub as hub

with tf.Graph().as_default():
  module = hub.Module("http://tfhub.dev/google/universal-sentence-encoder/2")
  text = tf.placeholder(tf.string, [None])
  embedding = module(text)

  init_op = tf.group([tf.global_variables_initializer(), tf.tables_initializer()])
  with tf.Session() as session:
    session.run(init_op)
    tf.saved_model.simple_save(
        session,
        "/tmp/serving_saved_model",
        inputs={"text": text},
        outputs={"embedding": embedding},
        legacy_init_op=tf.tables_initializer()
    )

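This exports the model in the format required for serving (to the folder /tmp/serving_saved_model). After that, you can follow the instructions in the documentation: https://www.tensorflow.org/serving/serving_basic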

Answer 3

Note that the other answers are for TensorFlow 1. Most TF Hub models for TensorFlow 2 are already compatible with TF Serving. For example, to deploy the USE-Large model:

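1. Download the model, either via the tensorflow_hub library or simply from https://tfhub.dev/google/universal-sentence-encoder-large/5
2. Put the contents into a folder that represents the model name and version, e.g. models/use-large/5
3. Run the TF Serving application, e.g. via Docker: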

docker run -t --rm -p 8501:8501 \
   -v "$PATH_TO_YOUR_WORKSPACE/models:/models" \
   -e MODEL_NAME="use-large" \
   tensorflow/serving

The model will then be available at localhost:8501/v1/models/use-large:

curl -d '{"instances": ["Hey!"]}' \
    -X POST http://localhost:8501/v1/models/use-large:predict
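To get from the served embeddings back to the sentence similarity the question asks about, the response can be post-processed on the client. The sketch below assumes the response body has the form {"predictions": [[...], [...]]} with one embedding per submitted sentence; cosine_similarity and similarity_from_response are hypothetical helpers, and cosine similarity is just one common choice of measure:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def similarity_from_response(response_json):
    """Compare the first two embeddings in a parsed predict response.

    Assumes response_json["predictions"] holds at least two embeddings.
    """
    first, second = response_json["predictions"][:2]
    return cosine_similarity(first, second)
```

Sending two sentences in one "instances" list and passing the parsed JSON to similarity_from_response gives a score between -1 and 1, where higher means more similar.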

How to make TensorFlow Hub embeddings servable using TensorFlow Serving?
