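System information
What is the top-level directory of the model you are using: object_detection / ssd_inception_v2
Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
After training the ssd_inception_v2 model on my custom dataset, I want to use it for inference. Since inference will later run on a device without a GPU, I switched to the CPU for inference only. I adapted object_detection_tutorial.ipynb to measure the inference time and ran the following code over a series of frames from a video.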
with detection_graph.as_default():
    with tf.Session(graph=detection_graph) as sess:
        while success:
            #print(str(datetime.datetime.now().time()) + " " + str(count))
            # read image
            success, image = vidcap.read()
            if not success:
                # no more frames: image is None here, so stop before resizing
                break
            # resize image
            image = cv2.resize(image, (711, 400))
            # crop columns 11..690, which leaves a 680 x 400 image
            image = image[:, 11:691]
            # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
            image_np_expanded = np.expand_dims(image, axis=0)
            #print(image_np_expanded.shape)
            image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
            # Each box represents a part of the image where a particular object was detected.
            boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
            # Each score represents the level of confidence for the corresponding object.
            # The score is shown on the result image, together with the class label.
            scores = detection_graph.get_tensor_by_name('detection_scores:0')
            classes = detection_graph.get_tensor_by_name('detection_classes:0')
            num_detections = detection_graph.get_tensor_by_name('num_detections:0')
            before = datetime.datetime.now()
            # Actual detection (only this call is timed).
            (boxes, scores, classes, num_detections) = sess.run(
                [boxes, scores, classes, num_detections],
                feed_dict={image_tensor: image_np_expanded})
            print("This took : " + str(datetime.datetime.now() - before))
            # Draw the detected boxes and labels onto the frame in place.
            vis_util.visualize_boxes_and_labels_on_image_array(
                image,
                np.squeeze(boxes),
                np.squeeze(classes).astype(np.int32),
                np.squeeze(scores),
                category_index,
                use_normalized_coordinates=True,
                line_thickness=8)
            #cv2.imwrite("converted/frame%d.jpg" % count, image) # save frame as JPEG file
            count += 1
The output is as follows:
This took : 0:00:04.289925
This took : 0:00:00.909071
This took : 0:00:00.917636
This took : 0:00:00.908391
This took : 0:00:00.896601
This took : 0:00:00.908698
This took : 0:00:00.890018
This took : 0:00:00.896373
.....
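As a rough check on what these numbers imply, here is a minimal sketch that turns the timings into a throughput figure. The helper name `summarize` and the choice to treat the first call as a one-off warm-up are my own assumptions for illustration; the latencies are copied from the output above.

```python
# Per-frame latencies in seconds, copied from the output above.
latencies = [4.289925, 0.909071, 0.917636, 0.908391,
             0.896601, 0.908698, 0.890018, 0.896373]


def summarize(latencies, warmup=1):
    """Return (mean latency in s, frames per second) after skipping warm-up calls.

    Assumption: the first `warmup` calls include one-time setup cost and are
    not representative of steady-state inference.
    """
    steady = latencies[warmup:]
    mean = sum(steady) / len(steady)
    return mean, 1.0 / mean


mean_latency, fps = summarize(latencies)
print("mean latency: %.3f s, throughput: %.2f fps" % (mean_latency, fps))

# Speed-up factor needed to reach each target frame rate (10 and 20 fps).
for target_fps in (10, 20):
    print("%d fps needs a %.1fx speed-up" % (target_fps, target_fps * mean_latency))
```

With the first call excluded, this works out to roughly one frame per second, so reaching 10 to 20 fps would need about an order of magnitude less time per frame.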
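Of course, 900 ms per image is not fast enough for video processing. After reading many threads, I see two possible ways to improve this:
So my question is whether the two improvements above could bring inference up to real-time speed (10 - 20 fps), or whether I am on the wrong path here and should try something else. Any suggestions are welcome.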