Deploying a Keras model to Google Cloud ML for prediction

Date: 2017-08-22 03:47:59

Tags: deployment tensorflow google-cloud-platform keras google-cloud-ml

I need to figure out how to deploy a model on Google Cloud ML. My first task is to deploy a very simple text classifier on the service. I do it in the following steps (which could perhaps be shortened to fewer steps; if so, feel free to tell me):

  1. Define the model in Keras and export it to YAML
  2. Load the YAML and export it as a TensorFlow SavedModel
  3. Upload the model to Google Cloud Storage
  4. Deploy the model from Storage to Google Cloud ML
  5. Set the uploaded model version as the default on the model's website
  6. Run the model with a sample input

I have finally completed steps 1-5, but now I get this strange error when running the model. Can anyone help? Details on the steps are below. Hopefully it also helps others who are stuck on one of the earlier steps. My model works fine locally.

I have seen Deploying Keras Models via Google Cloud ML and Export a basic Tensorflow model to Google Cloud ML, but they seem to be stuck on other steps of this process.

Error

    Prediction failed: Exception during model execution: AbortionError(code=StatusCode.INVALID_ARGUMENT, details="In[0] is not a matrix
             [[Node: MatMul = MatMul[T=DT_FLOAT, _output_shapes=[[-1,64]], transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/cpu:0"](Mean, softmax_W/read)]]")
    

Step 1

    # import necessary classes from Keras..
    model_input = Input(shape=(maxlen,), dtype='int32')
    embed = Embedding(input_dim=nb_tokens,
                      output_dim=256,
                      mask_zero=False,
                      input_length=maxlen,
                      name='embedding')
    x = embed(model_input)
    x = GlobalAveragePooling1D()(x)
    outputs = [Dense(nb_classes, activation='softmax', name='softmax')(x)]
    model = Model(input=[model_input], output=outputs, name="fasttext")
    # export to YAML..
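
A minimal sketch of the elided YAML export, assuming the Keras-1.x-era API used above (the file names are hypothetical and chosen to match the later steps):

    # serialize the architecture to YAML and the weights to HDF5
    with open('fasttext.yaml', 'w') as f:
        f.write(model.to_yaml())
    model.save_weights('fasttext_weights.h5')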
    

Step 2

    from __future__ import print_function
    
    import sys
    import os
    
    import tensorflow as tf
    from tensorflow.contrib.session_bundle import exporter
    import keras
    from keras import backend as K
    from keras.models import model_from_config, model_from_yaml
    from optparse import OptionParser
    
    EXPORT_VERSION = 1 # for us to keep track of different model versions (integer)
    
    def export_model(model_def, model_weights, export_path):
    
        with tf.Session() as sess:
            init_op = tf.global_variables_initializer()
            sess.run(init_op)
    
            K.set_learning_phase(0)  # all new operations will be in test mode from now on
    
            yaml_file = open(model_def, 'r')
            yaml_string = yaml_file.read()
            yaml_file.close()
    
            model = model_from_yaml(yaml_string)
    
            # force initialization
            model.compile(loss='categorical_crossentropy',
                          optimizer='adam') 
            Wsave = model.get_weights()
            model.set_weights(Wsave)
    
            # weights are not loaded as I'm just testing, not really deploying
            # model.load_weights(model_weights)   
    
            print(model.input)
            print(model.output)
    
            pred_node_names = output_node_names = 'Softmax:0'
            num_output = 1
    
            export_path_base = export_path
            export_path = os.path.join(
                tf.compat.as_bytes(export_path_base),
                tf.compat.as_bytes('initial'))
            builder = tf.saved_model.builder.SavedModelBuilder(export_path)
    
            # Build the signature_def_map.
            x = model.input
            y = model.output
    
            values, indices = tf.nn.top_k(y, 5)
            table = tf.contrib.lookup.index_to_string_table_from_tensor(tf.constant([str(i) for i in xrange(5)]))
            prediction_classes = table.lookup(tf.to_int64(indices))
    
            classification_inputs = tf.saved_model.utils.build_tensor_info(model.input)
            classification_outputs_classes = tf.saved_model.utils.build_tensor_info(prediction_classes)
            classification_outputs_scores = tf.saved_model.utils.build_tensor_info(values)
            classification_signature = (
            tf.saved_model.signature_def_utils.build_signature_def(inputs={tf.saved_model.signature_constants.CLASSIFY_INPUTS: classification_inputs},
              outputs={tf.saved_model.signature_constants.CLASSIFY_OUTPUT_CLASSES: classification_outputs_classes, tf.saved_model.signature_constants.CLASSIFY_OUTPUT_SCORES: classification_outputs_scores},
              method_name=tf.saved_model.signature_constants.CLASSIFY_METHOD_NAME))
    
            tensor_info_x = tf.saved_model.utils.build_tensor_info(x)
            tensor_info_y = tf.saved_model.utils.build_tensor_info(y)
    
            prediction_signature = (tf.saved_model.signature_def_utils.build_signature_def(
                inputs={'images': tensor_info_x},
                outputs={'scores': tensor_info_y},
                method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME))
    
            legacy_init_op = tf.group(tf.tables_initializer(), name='legacy_init_op')
            builder.add_meta_graph_and_variables(
                sess, [tf.saved_model.tag_constants.SERVING],
                signature_def_map={'predict_images': prediction_signature,
                   tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: classification_signature,},
                legacy_init_op=legacy_init_op)
    
            builder.save()
            print('Done exporting!')
    
            raise SystemExit
    
    if __name__ == '__main__':
        usage = "usage: %prog [options] arg"
        parser = OptionParser(usage)
        (options, args) = parser.parse_args()
    
        if len(args) < 3:   
            raise ValueError("Too few arguments!")
    
        model_def = args[0]
        model_weights = args[1]
        export_path = args[2]
        export_model(model_def, model_weights, export_path)
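
A hypothetical invocation of this export script, assuming it is saved as export_model.py and that the YAML and weights files from step 1 exist:

    python export_model.py fasttext.yaml fasttext_weights.h5 fasttext_cloud/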
    

Step 3

    gsutil cp -r fasttext_cloud/ gs://quiet-notch-xyz.appspot.com
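
To sanity-check the upload, you can list the bucket recursively. Note that Cloud ML Engine expects the deployment URI used in step 4 to point at the directory that directly contains saved_model.pb; with the export script above, that would be the 'initial' subdirectory:

    gsutil ls -r gs://quiet-notch-xyz.appspot.com/fasttext_cloud/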

Step 4

    from __future__ import print_function
    
    from oauth2client.client import GoogleCredentials
    from googleapiclient import discovery
    from googleapiclient import errors
    import time
    
    projectID = 'projects/{}'.format('quiet-notch-xyz')
    modelName = 'fasttext'
    modelID = '{}/models/{}'.format(projectID, modelName)
    versionName = 'Initial'
    versionDescription = 'Initial release.'
    trainedModelLocation = 'gs://quiet-notch-xyz.appspot.com/fasttext/'
    
    credentials = GoogleCredentials.get_application_default()
    ml = discovery.build('ml', 'v1', credentials=credentials)
    
    # Create a dictionary with the fields from the request body.
    requestDict = {'name': modelName, 'description': 'Online predictions.'}
    
    # Create a request to call projects.models.create.
    request = ml.projects().models().create(parent=projectID, body=requestDict)
    
    # Make the call.
    try:
        response = request.execute()
    except errors.HttpError as err: 
        # Something went wrong, print out some information.
        print('There was an error creating the model.' +
            ' Check the details:')
        print(err._get_reason())
    
        # Clear the response for next time.
        response = None
        raise
    
    
    time.sleep(10)
    
    requestDict = {'name': versionName,
                   'description': versionDescription,
                   'deploymentUri': trainedModelLocation}
    
    # Create a request to call projects.models.versions.create
    request = ml.projects().models().versions().create(parent=modelID,
                  body=requestDict)
    
    # Make the call.
    try:
        print("Creating model setup..", end=' ')
        response = request.execute()
    
        # Get the operation name.
        operationID = response['name']
        print('Done.')
    
    except errors.HttpError as err:
        # Something went wrong, print out some information.
        print('There was an error creating the version.' +
              ' Check the details:')
        print(err._get_reason())
        raise
    
    done = False
    request = ml.projects().operations().get(name=operationID)
    print("Adding model from storage..", end=' ')
    
    while (not done):
        response = None
    
        # Wait for 10000 milliseconds.
        time.sleep(10)
    
        # Make the next call.
        try:
            response = request.execute()
    
            # Check for finish.
            done = response.get('done', False)  # keep polling until the operation reports completion
    
        except errors.HttpError as err:
            # Something went wrong, print out some information.
            print('There was an error getting the operation.' +
                  'Check the details:')
            print(err._get_reason())
            done = True
            raise
    
    print("Done.")
    

Step 5

Done via the website.
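
In case a command-line alternative is useful, the default version could presumably also be set with gcloud at the time (check your gcloud version for the exact command):

    gcloud ml-engine versions set-default Initial --model=fasttext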

Step 6

    import googleapiclient.discovery

    def predict_json(instances, project='quiet-notch-xyz', model='fasttext', version=None):
        """Send json data to a deployed model for prediction.
    
        Args:
            project (str): project where the Cloud ML Engine Model is deployed.
            model (str): model name.
            instances ([Mapping[str: Any]]): Keys should be the names of Tensors
                your deployed model expects as inputs. Values should be datatypes
                convertible to Tensors, or (potentially nested) lists of datatypes
                convertible to tensors.
            version: str, version of the model to target.
        Returns:
            Mapping[str: any]: dictionary of prediction results defined by the
                model.
        """
        # Create the ML Engine service object.
        # To authenticate set the environment variable
        # GOOGLE_APPLICATION_CREDENTIALS=<path_to_service_account_file>
        service = googleapiclient.discovery.build('ml', 'v1')
        name = 'projects/{}/models/{}'.format(project, model)
    
        if version is not None:
            name += '/versions/{}'.format(version)
    
        response = service.projects().predict(
            name=name,
            body={'instances': instances}
        ).execute()
    
        if 'error' in response:
            raise RuntimeError(response['error'])
    
        return response['predictions']
    

Then run the function with a test input: predict_json({'inputs':[[18, 87, 13, 589, 0]]})

1 answer:

Answer 0 (score: 2)

There is now a sample demonstrating the use of Keras on CloudML Engine, including prediction. You can find the sample here:

https://github.com/GoogleCloudPlatform/cloudml-samples/tree/master/census/keras

I recommend comparing your code with that code.

Some additional suggestions that are still relevant:

CloudML Engine currently only supports using a single signature (the default signature). Looking at your code, I think your prediction_signature is more likely to lead to success, but you have not made it the default signature. I suggest the following:

builder.add_meta_graph_and_variables(
            sess, [tf.saved_model.tag_constants.SERVING],
            signature_def_map={tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: prediction_signature,},
            legacy_init_op=legacy_init_op)

Assuming you deploy that to the service, you would then invoke prediction like this:

predict_json({'images':[[18, 87, 13, 589, 0]]})

If you are testing locally using gcloud ml-engine local predict --json-instances, the input data is slightly different (it matches that of the batch prediction service). Each newline-separated line looks like this (showing a file with two lines):

{"images": [[18, 87, 13, 589, 0]]}
{"images": [[21, 85, 13, 100, 1]]}
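
For reference, a local test invocation might look like the following sketch (the local SavedModel path and the instances file name are assumptions):

gcloud ml-engine local predict --model-dir=fasttext_cloud/initial --json-instances=instances.json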

I actually don't know enough about the shape of model.x to be sure the data being sent is correct for your model.

By way of explanation, it may be insightful to consider the differences between the Classification and Prediction methods of SavedModel. One difference is that, when using tensorflow_serving, which is based on gRPC and therefore strongly typed, Classification provides a strongly-typed signature that most classifiers can use. You can then reuse the same client with any classifier.

That is not overly useful when using JSON, since JSON isn't strongly typed.

One other difference is that, when using tensorflow_serving, Classification accepts column-based inputs (a map from feature name to every value for that feature across the whole batch), whereas Prediction accepts row-based inputs (each input instance/example is a row).

CloudML abstracts that away a bit and always requires row-based inputs (a list of instances). Even though we only officially support Classification, Prediction should work as well.
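
To make the row-based vs. column-based distinction concrete, here is an illustrative sketch of the same two examples in both layouts (the exact nesting depth depends on your model's input shape, as noted above):

# row-based (Prediction; what CloudML always expects): one map per instance
[{'images': [18, 87, 13, 589, 0]},
 {'images': [21, 85, 13, 100, 1]}]

# column-based (Classification in tensorflow_serving): one list per feature, covering the whole batch
{'images': [[18, 87, 13, 589, 0],
            [21, 85, 13, 100, 1]]}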