Question

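I'm trying to write a custom ML prediction routine on AI Platform that takes text data from a client, does some custom preprocessing, passes it to the model, and runs the model. I was able to package and deploy this code to Google Cloud successfully. However, every time I send it a request from Node.js, I get back data: { error: 'Prediction failed: unknown error.' },.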

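Here is the relevant code from my custom prediction routine. Note that I set instances to the text on the client, then tokenize and preprocess it inside the custom prediction routine.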

def __init__(self, model, session):
    self.model = model
    self.sess = session

@classmethod
def from_path(cls, model_dir):
    m = Model(learning_rate=0.1)
    session = tf.Session()
    session.run(tf.global_variables_initializer())
    session.run(tf.local_variables_initializer())
    saver = tf.train.Saver(max_to_keep=0)
    saver.restore(session, (os.path.join(model_dir, 'model.ckpt')))
    return cls(m, session)

def predict(self, instances, **kwargs):
    utterance = nltk.word_tokenize(instances)
    utterance = self.preprocess_utterance(utterance)

    preds = self.sess.run([self.model['preds']], feed_dict={'input_data': utterance})
    return preds

Here is my Node.js code:

   text_string = "Hello how are you?"
   google.auth.getApplicationDefault(function (err, authClient, projectId) {
        if (err) {
            console.log('Authentication failed because of ', err);
            return;
        }
        if (authClient.createScopedRequired && authClient.createScopedRequired()) {
            var scopes = ['https://www.googleapis.com/auth/cloud-platform'];
            authClient = authClient.createScoped(scopes);
        }
        var request = {
            name: "projects/" + projectId + "/models/classifier",
            resource: {"instances": [text_string]},

            // This is a "request-level" option
            auth: authClient
        };

        machinelearning.projects.predict(request, function (err, result) {

            console.log(result)

            if (err) {
                console.log(err);
            } else {
                console.log(result);
                res.status(200).send('Hello, world! This is the prediction: ' + JSON.stringify(result)).end();
            }
        });
    });

In this code I'm simply sending the text to the Google Cloud model. The request body is: body: '{"instances":["Hello how are you?"]}',

Does anyone know why it's failing?

If not, does anyone have an idea of how I could debug this? The unknown error message isn't useful at all.

Edit:

Here is the output of saved_model_cli with the --all option.

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['length_input'] tensor_info:
        dtype: DT_INT32
        shape: ()
        name: Placeholder_3:0
    inputs['seqlen'] tensor_info:
        dtype: DT_INT32
        shape: (-1)
        name: Placeholder_2:0
    inputs['indicator'] tensor_info:
        dtype: DT_INT32
        shape: (-1, 2)
        name: Placeholder_1:0
    inputs['input_data'] tensor_info:
        dtype: DT_INT32
        shape: (-1, -1)
        name: Placeholder:0
    inputs['y'] tensor_info:
        dtype: DT_INT32
        shape: (-1, -1)
        name: Placeholder_4:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['preds'] tensor_info:
        dtype: DT_INT32
        shape: (-1, -1)
        name: Cast:0
  Method name is: tensorflow/serving/predict

Based on this, I should be able to supply this dictionary as input, but it doesn't work.

{"instances": [ { "input_data": [138, 30, 66], "length_input": 1, "indicator": [[0, 0]], "seqlen": [3], "y": [138, 30, 66] } ]}

Answer 1

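I found the problem. It wasn't the format of the input data; it was NLTK. nltk.word_tokenize was throwing an error because the data it needs for tokenization wasn't available. To fix it, I had to either upload that data to Google Cloud or use a tokenization method that doesn't require any data files.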
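A minimal sketch of the second option (a tokenizer that needs no data files), assuming a plain regex split is acceptable for the model's vocabulary. The name simple_tokenize is hypothetical and not from the original code, and its output will not always match nltk.word_tokenize (for example, contractions are kept whole here), so the vocabulary lookup may need checking.

```python
import re

# A word (keeping internal apostrophes, e.g. "don't"), or any single
# non-space punctuation character.
_TOKEN_RE = re.compile(r"\w+(?:'\w+)*|[^\w\s]")


def simple_tokenize(text):
    """Split text into word and punctuation tokens using only the stdlib."""
    return _TOKEN_RE.findall(text)


print(simple_tokenize("Hello how are you?"))
# ['Hello', 'how', 'are', 'you', '?']
```

Inside predict, this would be called once per string in instances in place of nltk.word_tokenize.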

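I don't know why Google Cloud's custom prediction routine software doesn't tell the user what error occurred, but in all my attempts, whenever something went wrong it only ever returned Unknown error. Had I known exactly what the error was, this would have been an easy fix.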
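One possible way to get at the hidden error, sketched under assumptions: catch exceptions inside predict and return the traceback text as the prediction, so it travels back in the normal response body. This assumes the platform serializes whatever list predict returns, which I have not verified. report_errors and DemoPredictor are hypothetical names used only for illustration, and this is meant as a temporary debugging aid, not something to leave in a deployed version.

```python
import functools
import traceback


def report_errors(predict_fn):
    """Wrap predict so a failure is returned as text instead of raised."""
    @functools.wraps(predict_fn)
    def wrapper(self, instances, **kwargs):
        try:
            return predict_fn(self, instances, **kwargs)
        except Exception:
            # Return the traceback as the only "prediction" so the client sees it.
            return [{"error": traceback.format_exc()}]
    return wrapper


class DemoPredictor(object):  # hypothetical stand-in for the real predictor
    @report_errors
    def predict(self, instances, **kwargs):
        # Simulated failure, similar in spirit to a missing tokenizer resource.
        raise LookupError("Resource punkt not found")


result = DemoPredictor().predict(["Hello how are you?"])
print(result[0]["error"])
```

With the decorator applied to the real predict method, the client would receive the Python traceback in place of predictions whenever preprocessing or the model call raises.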

Answer 2

I think you need:

{"instances": [
 {"input_data": "hello, how are you?"},
 {"input_data": "who is this?"}
]}

But we could confirm that if we could see the result of running saved_model_cli on the SavedModel files.

Unknown error when sending data to a Google Cloud ML custom prediction routine
