System information

Describe the current behavior

I downloaded Mobilenet_V1_1.0_224 from https://www.tensorflow.org/lite/models and ran a latency test on both the original frozen model file and the tflite file produced by the lite converter.

Frozen model test
```python
# some code to import graph from frozen model file
input_tensor = sess.graph.get_tensor_by_name('input:0')
output_tensor = sess.graph.get_tensor_by_name('MobilenetV1/Predictions/Reshape_1:0')

test_case = load_test_data(data_file, resize)
res = sess.run(output_tensor, feed_dict={input_tensor: test_case})

s = timeit.default_timer()
for i in range(1000):
    res = sess.run(output_tensor, feed_dict={input_tensor: test_case})
e = timeit.default_timer()
print("avg cost {} s".format((e - s) / 1000))
```
I got an average cost of 0.0266642200947 s.
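For reference, the timing loop above can be factored into a small helper (pure Python; the helper and its parameter names are my own, not part of the original script). A warm-up run before timing keeps one-time setup cost out of the average, which matters when comparing two runtimes:

```python
import timeit

def avg_latency(fn, iters=1000, warmup=1):
    """Call fn() untimed a few times, then return its mean wall-clock latency in seconds."""
    for _ in range(warmup):
        fn()  # warm-up: exclude one-time setup cost from the measurement
    start = timeit.default_timer()
    for _ in range(iters):
        fn()
    end = timeit.default_timer()
    return (end - start) / iters

# Trivial workload as a stand-in; in the benchmark above fn would be e.g.
# lambda: sess.run(output_tensor, feed_dict={input_tensor: test_case})
print("avg cost {} s".format(avg_latency(lambda: sum(range(1000)), iters=100)))
```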
The tflite model converted by the lite converter
```python
# Load TFLite model and allocate tensors.
interpreter = tf.contrib.lite.Interpreter(model_path=model_file)
interpreter.allocate_tensors()

# Get input and output tensors.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
print(input_details)
print(output_details)

# Test model on random input data.
input_shape = input_details[0]['shape']
input_size = (input_shape[1], input_shape[2])
print("size: ", input_size)
# input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)
input_data = load_test_data(data_file, resize)

interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()
output_data = interpreter.get_tensor(output_details[0]['index'])
print("Got {} by prob: {}".format(labels[np.argmax(output_data)], np.max(output_data)))

s = timeit.default_timer()
for i in range(200):
    interpreter.invoke()
    output_data = interpreter.get_tensor(output_details[0]['index'])
e = timeit.default_timer()
print("avg cost {} s".format((e - s) / 200))
print("Got {} by prob: {}".format(labels[np.argmax(output_data)], np.max(output_data)))
```
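The post-processing step in both scripts (picking the top class and its probability) is just an argmax over the output vector. A minimal NumPy illustration, with a made-up label list and output vector standing in for the real model output:

```python
import numpy as np

# Hypothetical labels and output probabilities, in the shape a 1-batch model returns.
labels = ["cat", "dog", "fish"]
output_data = np.array([[0.1, 0.7, 0.2]], dtype=np.float32)

top = np.argmax(output_data)  # flattened index of the highest-probability class
print("Got {} by prob: {}".format(labels[top], np.max(output_data)))
```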
Then I got an average cost of 0.0471310245991 s, and for the quantized model the average cost was even 0.133453620672 s.

I expected an improvement in latency, but it seems to be the opposite.

Does the TensorFlow Lite tooling only optimize models for specific embedded platforms (e.g., whatever the official benchmarks run on)? Or am I doing something wrong that causes the performance regression?