So I'm currently trying to optimize a model that was trained on the GPU so that it runs on the CPU. Below you can see the code I use for CPU inference, along with benchmarks of the model "as is" and of the frozen model (CPU and GPU performance).
I'd appreciate any advice on how to optimize the model so it runs better on the CPU.

Current system: 2x GTX 1070, 6600K, 16 GB RAM

Benchmark time for inference with the model "as is": 0.3 s per image
Benchmark time for the frozen model: 0.15 s per image
Benchmark time for CPU inference with the frozen model: 1.15 s per image
import random
import time

import numpy as np
import tensorflow as tf

# load_graph() and convert_to_pil_image() are helper functions defined elsewhere in my script.
if __name__ == '__main__':
    graph = load_graph("frozen_model.pb")

    # Input/output tensors of the frozen generator network (Gs).
    x = graph.get_tensor_by_name('prefix/Gs/latents_in:0')
    x2 = graph.get_tensor_by_name('prefix/Gs/labels_in:0')
    y = graph.get_tensor_by_name('prefix/Gs/images_out:0')

    config = tf.ConfigProto()  # placeholder; the session config is not shown in this snippet
    with tf.Session(graph=graph, config=config) as sess:
        latents = np.random.randn(1, 512).astype(np.float32)
        labels = np.zeros([latents.shape[0], 0], np.float32)
        i = 0

        # Warm-up run so the first timed iteration is not skewed by session start-up cost.
        y_out = sess.run(y, feed_dict={x: latents, x2: labels})

        while True:
            start_time = time.time()
            i += 1
            y_out = sess.run(y, feed_dict={x: latents, x2: labels})

            # Rescale the output from [-1, 1] to [0, 255] and convert it to a PIL image.
            data = y_out[0, :, :, :]
            data = data * 127.5
            data = data + 127.5
            data = convert_to_pil_image(data)
            print("--- %s seconds ---" % (time.time() - start_time))

            # Nudge the latents so each iteration generates a slightly different image.
            latents += random.uniform(-0.5, 0.5)
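
For reference, the load_graph helper is not shown above; the following is only a minimal sketch of what it might look like, assuming the standard TF 1.x pattern of importing a frozen GraphDef with name="prefix" (which is why the tensors are looked up as 'prefix/Gs/...:0'):

import tensorflow as tf

def load_graph(frozen_graph_filename):
    # Read the serialized GraphDef from the frozen .pb file.
    with tf.gfile.GFile(frozen_graph_filename, "rb") as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
    # Import it into a fresh graph under the 'prefix' name scope.
    with tf.Graph().as_default() as graph:
        tf.import_graph_def(graph_def, name="prefix")
    return graph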