Question

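I am using Keras to predict image classes. It works in Google Cloud ML (GCML), but for efficiency I need to change it to accept base64 strings instead of a JSON array. See the Related Documentation.

I can easily run Python code to decode a base64 string into a JSON array, but with GCML I have no opportunity to run a preprocessing step (unless maybe I use a Lambda layer in Keras, but I don't think that is the correct approach).

Another answer suggested adding a tf.placeholder with type tf.string, which makes sense, but how do I incorporate that into the Keras model?

Here is the complete code for training the model and saving the exported model for GCML...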

import os
import numpy as np
import tensorflow as tf
import keras
from keras import backend as K
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.preprocessing import image
from tensorflow.python.platform import gfile

IMAGE_HEIGHT = 138
IMAGE_WIDTH = 106
NUM_CLASSES = 329

def preprocess(filename):
    # decode the image file starting from the filename
    # end up with pixel values that are in the -1, 1 range
    image_contents = tf.read_file(filename)
    image = tf.image.decode_png(image_contents, channels=1)
    image = tf.image.convert_image_dtype(image, dtype=tf.float32) # 0-1
    image = tf.expand_dims(image, 0) # resize_bilinear needs batches
    image = tf.image.resize_bilinear(image, [IMAGE_HEIGHT, IMAGE_WIDTH], align_corners=False)
    image = tf.subtract(image, 0.5)
    image = tf.multiply(image, 2.0) # -1 to 1
    image = tf.squeeze(image,[0])
    return image



filelist = gfile.ListDirectory("images")
sess = tf.Session()
with sess.as_default():
    x = np.array([np.array(preprocess(os.path.join("images", filename)).eval()) for filename in filelist])

input_shape = (IMAGE_HEIGHT, IMAGE_WIDTH, 1)   # 1, because preprocessing made grayscale

# in our case the labels come from part of the filename
y = np.array([int(filename[filename.index('_')+1:-4]) for filename in filelist])
# convert class labels to numbers
y = keras.utils.to_categorical(y, NUM_CLASSES)

########## TODO: something here? ##########
image = K.placeholder(shape=(), dtype=tf.string)
decoded = tf.image.decode_jpeg(image, channels=3)
# scores = build_model(decoded)


model = Sequential()

# model.add(decoded)

model.add(Conv2D(32, kernel_size=(2, 2), activation='relu', input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.25))
model.add(Dense(NUM_CLASSES, activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,
            optimizer=keras.optimizers.Adadelta(),
            metrics=['accuracy'])

model.fit(
    x,
    y,
    batch_size=64,
    epochs=20,
    verbose=1,
    validation_split=0.2,
    shuffle=False
    )

predict_signature = tf.saved_model.signature_def_utils.build_signature_def(
    inputs={'input_bytes':tf.saved_model.utils.build_tensor_info(model.input)},
    ########## TODO: something here? ##########
    # inputs={'input': image },    # input name must have "_bytes" suffix to use base64.
    outputs={'formId': tf.saved_model.utils.build_tensor_info(model.output)},
    method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME
)

builder = tf.saved_model.builder.SavedModelBuilder("exported_model")

builder.add_meta_graph_and_variables(
    sess=K.get_session(),
    tags=[tf.saved_model.tag_constants.SERVING],
    signature_def_map={
        tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: predict_signature
    },
    legacy_init_op=tf.group(tf.tables_initializer(), name='legacy_init_op')
)

builder.save()

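This is related to my previous question.

Update

The heart of the question is how to incorporate the placeholder that calls the decode op into the Keras model. In other words, after creating the placeholder that decodes the base64 string to a tensor, how do I incorporate that into what Keras runs? I assume it needs to be a layer.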

image = K.placeholder(shape=(), dtype=tf.string)
decoded = tf.image.decode_jpeg(image, channels=3)
model = Sequential()

# Something like this, but this fails because it is a tensor, not a Keras layer.  Possibly this is where a Lambda layer comes in?
model.add(decoded)
model.add(Conv2D(32, kernel_size=(2, 2), activation='relu', input_shape=input_shape))
...

Update 2:

An attempt to use a Lambda layer to accomplish this...

import keras
from keras.models import Sequential
from keras.layers import Lambda
from keras import backend as K
import tensorflow as tf

image = K.placeholder(shape=(), dtype=tf.string)
model = Sequential()
model.add(Lambda(lambda image: tf.image.decode_jpeg(image, channels=3), input_shape=() ))

This gives the error: TypeError: Input 'contents' of 'DecodeJpeg' Op has type float32 that does not match expected type of string.

Answer 1

First of all, I use tf.keras, but that should not be a big problem. Here is an example of how to read a base64-encoded JPEG:

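import tensorflow as tf
from tensorflow.keras.layers import Input, Lambda

def preprocess_and_decode(img_str, new_shape=[299, 299]):
    # tf.io.decode_base64 expects web-safe base64 ('-' and '_' instead of '+' and '/')
    img = tf.io.decode_base64(img_str)
    img = tf.image.decode_jpeg(img, channels=3)
    img = tf.image.resize_images(img, new_shape, method=tf.image.ResizeMethod.BILINEAR, align_corners=False)
    # if you need to squeeze your input range to [0,1] or [-1,1] do it here
    return img

# one base64 string per example: shape (batch, 1), dtype string
InputLayer = Input(shape=(1,), dtype="string")
# map_fn decodes each example separately, because the images may differ in size before resizing
OutputLayer = Lambda(lambda img: tf.map_fn(lambda im: preprocess_and_decode(im[0]), img, dtype="float32"))(InputLayer)
base64_model = tf.keras.Model(InputLayer, OutputLayer)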

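The code above creates a model that takes a JPEG of any size, resizes it to 299x299, and returns it as a 299x299x3 tensor. This model can be exported directly to a saved_model and used for Cloud ML Engine serving. On its own it is a little pointless, since the only thing it does is convert base64 to a tensor.

If you need to feed the output of this model into the input of an existing trained and compiled model (e.g. inception_v3), do the following: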

base64_input = base64_model.input
final_output = inception_v3(base64_model.output)
new_model = tf.keras.Model(base64_input,final_output)

This new_model can be saved. It takes a base64 JPEG and returns the classes identified by the inception_v3 part.
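One detail worth checking on the client side: to my knowledge tf.io.decode_base64 decodes the web-safe base64 alphabet, so a string produced by the standard encoder may fail to decode once it contains '+' or '/'. The following is a minimal sketch using only the Python standard library; the helper name to_websafe_b64 and the sample bytes are made up for illustration and are not part of the answer's code.

```python
import base64

def to_websafe_b64(raw_bytes):
    """Encode raw bytes with the URL-safe base64 alphabet ('-' and '_')."""
    return base64.urlsafe_b64encode(raw_bytes).decode("ascii")

# Hypothetical sample bytes, chosen so that the two alphabets differ.
sample = b"\xfb\xff\xfe"

standard = base64.b64encode(sample).decode("ascii")  # uses '+' and '/'
websafe = to_websafe_b64(sample)                     # uses '-' and '_'

print(standard, websafe)
```

In practice the bytes would come from reading a JPEG file in binary mode, and the web-safe string would be sent as the value of the model's string input.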

Answer 2

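"Another answer suggested adding a tf.placeholder with type tf.string, which makes sense, but how do I incorporate that into the Keras model?"

In Keras you can access your selected backend (in this case TensorFlow) by doing: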

from keras import backend as K

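You already seem to import this in your code. It gives you access to some of the native methods and resources available on the backend of your choice. The Keras backend includes a method for creating placeholders, among other utilities. Regarding placeholders, here is what the Keras docs say about them:

placeholder

keras.backend.placeholder(shape=None, ndim=None, dtype=None, sparse=False, name=None)

Instantiates a placeholder tensor and returns it.

The docs also give an example of its use: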

>>> from keras import backend as K
>>> input_ph = K.placeholder(shape=(2, 4, 5))
>>> input_ph._keras_shape
(2, 4, 5)
>>> input_ph
<tf.Tensor 'Placeholder_4:0' shape=(2, 4, 5) dtype=float32>

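As you can see, this returns a TensorFlow tensor with shape (2, 4, 5) and dtype float32. If you had another backend when running the example, you would get a different tensor object (a Theano one, most likely). You can therefore use this placeholder() to adapt the solution you got on your previous question.

In summary, you can use your backend imported as K (or whatever name you like) to call the methods and objects available on your chosen backend, by making calls of the form K.foo.bar() on the desired method. I suggest you read the Keras Backend documentation to explore more things that may be useful in future situations.

Update: as per your edit. Yes, this placeholder should be a layer in your model. Specifically, it should be the input layer of your model, since it holds your decoded image (as Keras needs it) to classify.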
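For completeness, here is a rough sketch of what a request body could look like when the exported signature uses an input alias ending in "_bytes", as the comment in the question's code mentions. As I understand the Cloud ML Engine convention, binary values are wrapped as {"b64": "..."} inside an "instances" list, and the service base64-decodes them before feeding the tensor, so the graph would then receive raw image bytes; please verify this against the service documentation. The function name build_predict_request, the alias input_bytes, and the placeholder bytes are assumptions for illustration.

```python
import base64
import json

def build_predict_request(image_bytes, input_name="input_bytes"):
    """Wrap raw image bytes in a JSON-serializable online prediction body.

    input_name is a hypothetical tensor alias; note that it ends in "_bytes".
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"instances": [{input_name: {"b64": encoded}}]}

# Placeholder bytes standing in for the contents of a real JPEG file.
fake_jpeg = b"\xff\xd8\xff\xe0 not a real image \xff\xd9"

request_body = json.dumps(build_predict_request(fake_jpeg))
print(request_body)
```

With a real image, fake_jpeg would be replaced by the result of open(path, "rb").read().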

Base64 images with Keras and Google Cloud ML
