Question

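I have a problem: the value passed to my Lambda layer at compile time is a placeholder generated by Keras, which holds no value. When the model is compiled, the .eval() call raises this error: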

You must feed a value for placeholder tensor 'input_1' with dtype string and shape [?,1]

def text_preprocess(x):
  strings = tf.keras.backend.eval(x)
  vectors = []
  for string in strings:
    vector = string_to_one_hot(string.decode('utf-8'))
    vectors.append(vector)
  vectorTensor = tf.constant(np.array(vectors),dtype=tf.float32)
  return vectorTensor

input_text = Input(shape=(1,), dtype=tf.string)
embedding = Lambda(text_preprocess)(input_text)
dense = Dense(256, activation='relu')(embedding)
outputs = Dense(2, activation='softmax')(dense)

model = Model(inputs=[input_text], outputs=outputs)
model.compile(loss='categorical_crossentropy',optimizer='adam', metrics=['accuracy'])
model.summary()
model.save('test.h5')

If I pass an array of strings to the input layer statically, the model compiles, but I hit the same error when I try to convert the model to tflite.

#I replaced this line:
input_text = Input(shape=(1,), dtype=tf.string)

#with these lines:
test = tf.constant(["Hello","World"])
input_text = Input(shape=(1,), dtype=tf.string, tensor=test)

#but calling this ...
converter = TFLiteConverter.from_keras_model_file('string_test.h5')
tfmodel = converter.convert()

#... still leads to this error:

InvalidArgumentError: You must feed a value for placeholder tensor 'input_3' with dtype string and shape [2] [[{{node input_3}}]]

Answer 1

OK, I finally solved the problem this way:

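def text_preprocess(x):
  # decode each UTF-8 string into Unicode code points (returns a RaggedTensor)
  b = tf.strings.unicode_decode(x,'UTF-8')
  # pad the ragged result into a dense tensor, filling with 0
  b = b.to_tensor(default_value=0)
  #do things with decoded string
  # one_hot_size is defined elsewhere (not shown here)
  one_hot = K.one_hot(b,one_hot_size)
  return one_hot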

...
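
For anyone who wants to try the idea in isolation, here is a minimal sketch, assuming TensorFlow 2.x with eager execution. It is not the exact code from the answer: ONE_HOT_SIZE is a hypothetical value (128, enough for ASCII), tf.one_hot stands in for K.one_hot, and the tf.squeeze step is my own addition to drop the trailing axis of the (batch, 1) input. The point it illustrates is that every step is a TensorFlow op, so the function works on symbolic tensors and nothing has to be evaluated while the graph is being built.

```python
import tensorflow as tf

# Hypothetical depth: covers ASCII code points only. Code points outside
# this range produce all-zero rows in tf.one_hot.
ONE_HOT_SIZE = 128


def text_preprocess(x):
    # x is a string tensor of shape (batch, 1); drop the trailing axis.
    x = tf.squeeze(x, axis=-1)
    # Decode each string into Unicode code points (a RaggedTensor).
    codepoints = tf.strings.unicode_decode(x, "UTF-8")
    # Pad to a dense (batch, max_len) tensor, using 0 as the fill value.
    dense = codepoints.to_tensor(default_value=0)
    # One-hot encode to shape (batch, max_len, ONE_HOT_SIZE).
    return tf.one_hot(dense, ONE_HOT_SIZE)


# Eager call on concrete values.
batch = tf.constant([["Hi"], ["abc"]])
encoded = text_preprocess(batch)

# The same function traced as a graph with an unknown batch size,
# which is roughly the situation inside a Lambda layer.
graph_fn = tf.function(
    text_preprocess,
    input_signature=[tf.TensorSpec(shape=[None, 1], dtype=tf.string)],
)
graph_encoded = graph_fn(batch)
```

Note that the padding value 0 is itself one-hot encoded as index 0, so padded positions are not all-zero vectors. If that matters for your model, you would need a mask or a different fill value.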

How do I preprocess strings in a Keras model Lambda layer?
