Question

I have trained a custom Keras network and want to deploy it on an MCU. I have to quantize it to UINT8.

model = tf.keras.models.load_model('saved_model/MaskNet_extended.h5')
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.OPTIMIZE_FOR_SIZE]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = [tf.uint8]
converter.inference_output_type = [tf.uint8]
converter.representative_dataset = rep_ds
tflite_quant_model = converter.convert()

The problem is that tflite_quant_model is still float32. How is that possible?

The network is:

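from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# IMG_SHAPE (the input image shape) is not shown in the question.
model = Sequential([
    Conv2D(16, 3, padding='same', activation='relu',
           input_shape=IMG_SHAPE),
    MaxPooling2D(),
    Conv2D(32, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Conv2D(64, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Flatten(),
    Dense(512, activation='relu'),
    Dense(1, activation='sigmoid')
])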

Answer 1

This is a known issue in TFLiteConverterV2. There is a workaround at https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/tools/optimize/python/modify_model_interface.py.

TensorFlow Lite will also be able to output uint8 soon.
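For readers on a newer TensorFlow release, here is a rough sketch of how the conversion could look once the converter accepts integer input/output types (to my knowledge this landed around TensorFlow 2.3, so treat the version as an assumption). Note that the snippet in the question assigns lists to inference_input_type and inference_output_type, whereas these attributes expect a single dtype. The helper names make_representative_dataset, convert_to_uint8 and io_dtypes are made up for this example, and the calibration samples are assumed to be float arrays of shape (N, H, W, C) preprocessed like the training data.

```python
import numpy as np


def make_representative_dataset(samples, num_batches=100):
    """Return a generator function that yields calibration batches.

    `samples` is assumed to be a float array of shape (N, H, W, C),
    preprocessed the same way as the training data.
    """
    def representative_dataset():
        for sample in samples[:num_batches]:
            # The converter expects a list with one float32 array per model
            # input, each with a leading batch dimension of 1.
            yield [np.expand_dims(sample, axis=0).astype(np.float32)]
    return representative_dataset


def convert_to_uint8(model, representative_dataset):
    """Convert a Keras model to an integer-only TFLite model with uint8 I/O."""
    # Imported here so the dataset helper above works without TensorFlow.
    import tensorflow as tf

    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    # OPTIMIZE_FOR_SIZE is deprecated; DEFAULT enables the same quantization.
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset
    # Restrict the converter to integer kernels so no float ops remain.
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    # Single dtypes, not lists.
    converter.inference_input_type = tf.uint8
    converter.inference_output_type = tf.uint8
    return converter.convert()


def io_dtypes(tflite_model):
    """Return the (input, output) dtypes of a converted model."""
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_content=tflite_model)
    interpreter.allocate_tensors()
    return (interpreter.get_input_details()[0]["dtype"],
            interpreter.get_output_details()[0]["dtype"])
```

With a hypothetical array calibration_images, calling convert_to_uint8(model, make_representative_dataset(calibration_images)) and passing the result to io_dtypes is a quick way to check whether the input and output tensors really are uint8. On converter versions affected by the issue above, they may still come back as float32.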

TFLite model converter does not output uint8
