Question

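What is the best way to add a preprocessing layer (e.g. subtract the mean and divide by the std) to a keras (v2.0.5) model so that the model is fully self-contained for deployment (possibly in a C++ environment)? I tried: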

    def getmodel():
       model = Sequential()
       mean_tensor = K.placeholder(shape=(1,1,3), name="mean_tensor")
       std_tensor = K.placeholder(shape=(1,1,3), name="std_tensor")

       preproc_layer = Lambda(lambda x: (x - mean_tensor) / (std_tensor + K.epsilon()),
                              input_shape=im_shape)

       model.add(preproc_layer)

       # Build the remaining model, perhaps set weights,
       ...

       return model

Then, somewhere else, set the mean/std on the model. I found the set_value function, so I tried the following:

m = getmodel()
mean, std = get_mean_std(..)

graph = K.get_session().graph
mean_tensor = graph.get_tensor_by_name("mean_tensor:0")
std_tensor = graph.get_tensor_by_name("std_tensor:0")

K.set_value(mean_tensor, mean)
K.set_value(std_tensor, std)

However, set_value fails with

AttributeError: 'Tensor' object has no attribute 'assign'

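So set_value does not work the way the (limited) docs suggest. What is the proper way to do this? Get the TF session, wrap all the training code in a with (session) and use feed_dict? I would have thought there would be a native keras way to set tensor values.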

Instead of using a placeholder, I tried setting the mean/std at model construction time using either K.variable or K.constant:

mean_tensor = K.variable(mean, name="mean_tensor")
std_tensor = K.variable(std, name="std_tensor")

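This avoids any set_value problems. However, if I try to train that model (which I know is not particularly efficient, since the normalisation is redone for every image), training works, but at the end of the first epoch the ModelCheckpoint handler fails with a very deep stack trace: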

...
File "/Users/dgorissen/Library/Python/2.7/lib/python/site-packages/keras/models.py", line 102, in save_model
  'config': model.get_config()
File "/Users/dgorissen/Library/Python/2.7/lib/python/site-packages/keras/models.py", line 1193, in get_config
  return copy.deepcopy(config)
File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/copy.py", line 163, in deepcopy
  y = copier(x, memo)
...
File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/copy.py", line 190, in deepcopy
  y = _reconstruct(x, rv, 1, memo)
File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/copy.py", line 343, in _reconstruct
  y.__dict__.update(state)
AttributeError: 'NoneType' object has no attribute 'update'

Update 1:

I also tried a different approach. Train the model as normal, then prepend a second model that does the preprocessing:

# Regular model, trained as usual
model = ...

# Preprocessing model
preproc_model = Sequential()
mean_tensor = K.constant(mean, name="mean_tensor")
std_tensor = K.constant(std, name="std_tensor")
preproc_layer = Lambda(lambda x: (x - mean_tensor) / (std_tensor + K.epsilon()),
                       input_shape=im_shape, name="normalisation")
preproc_model.add(preproc_layer)

# Prepend the preprocessing model to the regular model    
full_model = Model(inputs=[preproc_model.input],
              outputs=[model(preproc_model.output)])

# Save the complete model to disk
full_model.save('full_model.hdf5')

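This seems to work until the save() call, which fails with the same deep stack trace as above. Perhaps the Lambda layer is the problem, but judging from this issue it seems it should serialise properly.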

So overall, how do I prepend a normalisation layer to a keras model without compromising the ability to serialise it (and export it to pb)?

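I am sure you can get this working by dropping down to TF directly (e.g. this thread, or using tf.Transform), but I would have thought it would be possible in keras directly.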

Update 2:

So I found that the deep stack trace could be avoided by doing

def foo(x):
    bar = K.variable(baz, name="baz")
    return x - bar

That is, by defining bar inside the function instead of capturing it from the outer scope.

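I then found that I could save to disk but could not load from disk. There is a suite of github issues around this. I used the workaround specified in #5396, passing all variables in as arguments, which then allowed me to both save and load.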

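Thinking I was almost there, I continued with the approach from Update 1 above, stacking a preprocessing model in front of the trained model. This led to Model is not compiled errors. I worked around those, but in the end I never managed to get the following to work: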

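Build and train a model
Save it to disk
Load it, prepend the preprocessing model
Export the stacked model to disk as a frozen pb file
Load the frozen pb from disk
Apply it to some unseen data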

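I got to the point where there were no errors, but I could not get the normalisation tensors to propagate through to the frozen pb. Having spent too much time on this, I gave up and switched to a somewhat less elegant approach: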

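Build the model with the preprocessing operations in it from the start, but set to a no-op (mean=0, std=1)
Train the model, then build an identical model, this time with the proper values for mean/std
Transfer the weights
Export and freeze the model to pb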

All of this now works as expected. There is a small overhead during training, but it is negligible for me.
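The reason the weight transfer is safe can be sketched with NumPy standing in for Keras. This is only an illustration under stated assumptions: toy_model is a hypothetical one-layer stand-in for the real network, the data is random, and it assumes the training data was normalised outside the model while the in-model op was the no-op.

```python
import numpy as np

EPS = 1e-7  # assumed stand-in for K.epsilon()

def normalise(x, mean, std):
    """The in-model preprocessing op from the question."""
    return (x - mean) / (std + EPS)

def toy_model(x, weights, mean, std):
    """Hypothetical stand-in for the network: preprocessing, then one linear layer."""
    return normalise(x, mean, std).dot(weights)

rng = np.random.RandomState(0)
raw = rng.rand(5, 3) * 255.0   # raw inputs with 3 channels
weights = rng.rand(3, 2)       # pretend these came out of training

data_mean = raw.mean(axis=0)
data_std = raw.std(axis=0)

# Training-time model: no-op preprocessing, fed data normalised outside the model.
pre_normalised = normalise(raw, data_mean, data_std)
train_out = toy_model(pre_normalised, weights, mean=np.zeros(3), std=np.ones(3))

# Deployment model: same weights, real statistics, fed raw data.
deploy_out = toy_model(raw, weights, mean=data_mean, std=data_std)

outputs_match = np.allclose(train_out, deploy_out)
```

The two outputs agree up to the tiny epsilon term, so weights learned with the no-op layer remain valid once the real mean/std are in place.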

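I still have not figured out how to set the value of a tensor variable in keras (without raising the assign exception), but I can do without it for now.

I will accept @Daniel's answer, as it got me going in the right direction.

Related question: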

Add Tensorflow pre-processing to existing Keras model (for use in Tensorflow Serving)

Answer 1

When creating a variable, you must give it the "value", not the shape:

mean_tensor = K.variable(mean, name="mean_tensor")
std_tensor = K.variable(std, name="std_tensor")

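Now, in Keras, you do not have to deal with sessions, graphs and things like that. You work only with layers, and inside Lambda layers (or loss functions) you can work with tensors.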

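For our Lambda layer we need a more complex function, because shapes must match before you do the calculation. Since I do not know im_shape, I assumed it has 3 dimensions: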

def myFunc(x):

    #reshape x in a way it's compatible with the tensors mean and std:
    x = K.reshape(x,(-1,1,1,3)) 
        #-1 is like a wildcard, it will be the value that matches the rest of the given shape.     
        #I chose (1,1,3) because it's the same shape of mean_tensor and std_tensor

    result = (x - mean_tensor) / (std_tensor + K.epsilon())

    #now shape it back to the same shape it was before (which I don't know)    
    return K.reshape(result,(-1,im_shape[0], im_shape[1], im_shape[2]))
        #-1 is still necessary, it's the batch size
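As a side check on the shape handling above, here is a NumPy sketch that compares the reshape round trip with plain broadcasting. NumPy is used as a stand-in for the Keras backend, which is an assumption (the TensorFlow backend follows NumPy-style broadcasting for these elementwise ops, but this was not verified against keras 2.0.5). The sizes and the mean/std values are made up for illustration.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
batch, height, width, channels = 4, 5, 6, 3
rng = np.random.RandomState(0)
x = rng.rand(batch, height, width, channels)

# Illustrative per-channel statistics with the (1, 1, 3) shape used in the answer.
mean = np.array([0.485, 0.456, 0.406]).reshape(1, 1, 3)
std = np.array([0.229, 0.224, 0.225]).reshape(1, 1, 3)
eps = 1e-7  # assumed stand-in for K.epsilon()

# Route 1: reshape to (-1, 1, 1, 3), normalise, reshape back (as in myFunc).
flat = x.reshape(-1, 1, 1, channels)
via_reshape = ((flat - mean) / (std + eps)).reshape(-1, height, width, channels)

# Route 2: let (1, 1, 3) broadcast against (batch, height, width, 3) directly.
via_broadcast = (x - mean) / (std + eps)

same = np.allclose(via_reshape, via_broadcast)
```

In this sketch both routes give the same result, which suggests the reshapes are a safe but possibly unnecessary precaution when the channel axis is last.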

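Now we create the Lambda layer, keeping in mind that it also needs an output shape (because of your custom operation, the system does not necessarily know the output shape):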

model.add(Lambda(myFunc,input_shape=im_shape, output_shape=im_shape))

After this, just compile the model and train it (as usual, with model.compile(...) and model.fit(...)).

If you want to include everything in the function, including computing the statistics, that is fine too:

def myFunc(x):

    mean_tensor = K.mean(x,axis=[0,1,2]) #considering shapes of (size,width,height,channels)
    std_tensor = K.std(x,axis=[0,1,2])

    x = K.reshape(x, (-1,3)) #shapes of mean and std are (3,) here.    
    result = (x - mean_tensor) / (std_tensor + K.epsilon())

    return K.reshape(result,(-1,width,height,3))
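One property of this variant is worth seeing in numbers: the statistics come from whatever batch is passed in, so the same image can be normalised differently depending on its batch companions. The NumPy sketch below is only an analogue of the function above, not the Keras code itself. batch_normalise and the random data are hypothetical.

```python
import numpy as np

def batch_normalise(x, eps=1e-7):
    """NumPy analogue of the second myFunc: statistics come from the batch itself."""
    mean = x.mean(axis=(0, 1, 2))    # shape (3,), one value per channel
    std = x.std(axis=(0, 1, 2))      # shape (3,)
    return (x - mean) / (std + eps)  # (3,) broadcasts over the channel axis

rng = np.random.RandomState(0)
image = rng.rand(1, 4, 4, 3)
others_a = rng.rand(3, 4, 4, 3)
others_b = rng.rand(3, 4, 4, 3) * 10.0  # companions on a different scale

# The same image, normalised inside two different batches.
out_a = batch_normalise(np.concatenate([image, others_a]))[0]
out_b = batch_normalise(np.concatenate([image, others_b]))[0]

differs = not np.allclose(out_a, out_b)
```

If fixed dataset-wide statistics are what the deployed model should use, the first version with a stored mean_tensor and std_tensor is probably the closer match.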

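Now, all of this is extra computation inside your model and will cost processing time. It is better to do everything outside the model: create the preprocessed data first and store it, then build the model without this preprocessing layer. That way you get a faster model. (This can matter if your data or your model is very large.)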
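A minimal sketch of that offline route, assuming the training images fit in memory as a single NumPy array. The array contents, file names and temporary directory are placeholders, not anything from the original post.

```python
import os
import tempfile
import numpy as np

# Hypothetical training images: (num_images, height, width, channels).
rng = np.random.RandomState(0)
images = rng.randint(0, 256, size=(8, 4, 4, 3)).astype("float32")

# Per-channel statistics over the whole training set.
mean = images.mean(axis=(0, 1, 2))
std = images.std(axis=(0, 1, 2))

# Normalise once, ahead of training.
normalised = (images - mean) / (std + 1e-7)

# Store the data and the statistics; the statistics are needed again at inference time.
out_dir = tempfile.mkdtemp()
np.save(os.path.join(out_dir, "train_normalised.npy"), normalised)
np.savez(os.path.join(out_dir, "norm_stats.npz"), mean=mean, std=std)

# Reload the statistics the way an inference script might.
stats = np.load(os.path.join(out_dir, "norm_stats.npz"))
```

The trade-off is that the caller must then apply the same mean/std before every prediction, so the exported model is no longer self-contained in the sense the question asks for.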
