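I am trying to do person detection with TensorFlow.
This is the video of what I have so far.
My questions:
1. I am running TensorFlow on an NVIDIA GTX 1050 Ti GPU, and it runs extremely slowly.
2. The detections still flicker. Thanks to the Kalman filter the IDs stay consistent, but how can I smooth the detections and keep them from flickering?
This is the detection code I am using: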
import numpy as np
import matplotlib.pyplot as plt
from object_detection.utils import visualization_utils as vis_util


def get_localization(self, image, visual=False):
    """Determines the locations of the cars in the image

    Args:
        image: camera image

    Returns:
        list of bounding boxes: coordinates [y_up, x_left, y_down, x_right]
    """
    category_index = {1: {'id': 1, 'name': u'person'},
                      2: {'id': 2, 'name': u'bicycle'},
                      3: {'id': 3, 'name': u'car'},
                      4: {'id': 4, 'name': u'motorcycle'},
                      5: {'id': 5, 'name': u'airplane'},
                      6: {'id': 6, 'name': u'bus'},
                      7: {'id': 7, 'name': u'train'},
                      8: {'id': 8, 'name': u'truck'},
                      9: {'id': 9, 'name': u'boat'},
                      10: {'id': 10, 'name': u'traffic light'},
                      11: {'id': 11, 'name': u'fire hydrant'},
                      13: {'id': 13, 'name': u'stop sign'},
                      14: {'id': 14, 'name': u'parking meter'}}
    with self.detection_graph.as_default():
        # The model expects a batch dimension: [1, height, width, 3]
        image_expanded = np.expand_dims(image, axis=0)
        (boxes, scores, classes, num_detections) = self.sess.run(
            [self.boxes, self.scores, self.classes, self.num_detections],
            feed_dict={self.image_tensor: image_expanded})
        if visual:
            vis_util.visualize_boxes_and_labels_on_image_array(
                image,
                np.squeeze(boxes),
                np.squeeze(classes).astype(np.int32),
                np.squeeze(scores),
                category_index,
                use_normalized_coordinates=True, min_score_thresh=.4,
                line_thickness=3)
            plt.figure(figsize=(9, 6))
            plt.imshow(image)
            plt.show()
        boxes = np.squeeze(boxes)
        classes = np.squeeze(classes)
        scores = np.squeeze(scores)
        cls = classes.tolist()
        # Keep person (1), bicycle (2), car (3), motorcycle (4) and truck (8),
        # and apply the score threshold to every class. Written as a chain of
        # `or` ... `and`, the threshold only applied to class 8, because `and`
        # binds tighter than `or`.
        wanted = (1, 2, 3, 4, 8)
        idx_vec = [i for i, v in enumerate(cls) if v in wanted and scores[i] > 0.6]
        if len(idx_vec) == 0:
            print('no detection!')
            self.car_boxes = []
        else:
            tmp_car_boxes = []
            for idx in idx_vec:
                dim = image.shape[0:2]
                box = self.box_normal_to_pixel(boxes[idx], dim, scores[idx])
                box_h = box[2] - box[0]
                box_w = box[3] - box[1]
                ratio = box_h / (box_w + 0.01)
                # NOTE: `ratio < 0.8 or ratio > 0.8` only rejects ratio == 0.8,
                # so in practice this is just a minimum-size filter.
                if ((ratio < 0.8 or ratio > 0.8) and (box_h > 20) and (box_w > 20)):
                    tmp_car_boxes.append(box)
                    #print(box, ', confidence: ', scores[idx], 'ratio:', ratio)
                #else:
                    #print('wrong ratio or wrong size, ', box, ', confidence: ', scores[idx], 'ratio:', ratio)
            self.car_boxes = tmp_car_boxes
    return self.car_boxes
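For the flicker in point 2, one common idea is to keep drawing a track's last box for a few frames after its detection drops out, and to blend each new detection with the previous box. Below is a minimal sketch of that idea only; `SmoothedTrack`, `alpha` and `max_misses` are hypothetical names, not part of the tracker used above, and matching detections to tracks is assumed to happen elsewhere (for example in the existing Kalman tracker).

```python
class SmoothedTrack:
    """Holds one box across short detection gaps and smooths its motion.

    Hypothetical helper: it does not replace the Kalman filter, it only
    post-processes the box that is drawn for one already-matched track.
    """

    def __init__(self, box, alpha=0.5, max_misses=5):
        self.box = [float(v) for v in box]  # [y_up, x_left, y_down, x_right]
        self.alpha = alpha                  # weight of the newest detection
        self.max_misses = max_misses        # frames to hold without a detection
        self.misses = 0

    def update(self, detection):
        """Pass the matched detection for this frame, or None if there was none.

        Returns the box to draw, or None once the track has expired.
        """
        if detection is None:
            self.misses += 1
            if self.misses > self.max_misses:
                return None      # gap too long: stop drawing this track
            return self.box      # short gap: keep the last smoothed box
        self.misses = 0
        # Exponential moving average between the new and the previous box.
        self.box = [self.alpha * d + (1 - self.alpha) * b
                    for d, b in zip(detection, self.box)]
        return self.box


track = SmoothedTrack([100, 50, 300, 150], alpha=0.5, max_misses=2)
smoothed = track.update([110, 50, 310, 150])  # blended with the previous box
held = track.update(None)                     # detection missing, box is held
```

A smaller `alpha` gives steadier boxes that lag more behind fast motion, and a larger `max_misses` hides longer dropouts at the cost of boxes lingering after a person has left the frame. If the tracker already has a setting for how long unmatched tracks are kept, raising it may achieve the holding part without extra code.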
Answer 0 (score: 0)
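One of the problems is that you are using feed_dict in your input pipeline, which is very inefficient. You are basically going from OpenCV C++ to Python and then to TensorFlow C++, which wastes a lot of performance and starves the GPU.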
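Before changing the pipeline, it may be worth measuring where the time per frame actually goes. The following is a rough sketch of a per-stage timer; `StageTimer` is a hypothetical helper, and the stage names and the functions you pass to it are up to you.

```python
import time
from collections import defaultdict


class StageTimer:
    """Accumulates wall-clock time per named stage (hypothetical helper)."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    def measure(self, name, func, *args, **kwargs):
        """Call func(*args, **kwargs), record how long it took, return its result."""
        start = time.perf_counter()
        result = func(*args, **kwargs)
        self.totals[name] += time.perf_counter() - start
        self.counts[name] += 1
        return result

    def report(self):
        """Return the average number of seconds per call for each stage."""
        return {name: self.totals[name] / self.counts[name]
                for name in self.totals}


timer = StageTimer()
# Stand-in workload; in the real loop this would wrap the frame read,
# the get_localization call and the drawing step as separate stages.
total = timer.measure("demo_stage", sum, range(1000))
averages = timer.report()
```

In the detection loop this could look like `boxes = timer.measure("inference", detector.get_localization, frame)`, where `detector` and `frame` are whatever your loop already has. The first inference call typically includes one-off initialization, so it is better to discard it when comparing averages.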
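Unfortunately, I can't tell you directly how to fix this in TensorFlow in combination with OpenCV. I have been using NVIDIA's TensorRT in the past, and it works really well. The basic approach of this framework is:
During this process you can also strip out some unnecessary parts of the model to improve performance.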
Another approach is shown in this link. The author uses a kind of multithreading/multiprocessing hybrid to fetch the video data!
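To illustrate the general idea of reading frames in the background (this is not the code from the link, and it uses a thread only, not multiple processes), here is a minimal sketch. `start_frame_reader`, `iter_frames` and `read_frame` are hypothetical names; `read_frame` is assumed to be any callable that returns the next frame, or `None` when the stream ends.

```python
import queue
import threading

_STOP = object()  # sentinel that marks the end of the stream


def start_frame_reader(read_frame, max_queue=4):
    """Read frames on a background thread and hand them over through a queue.

    read_frame: callable returning the next frame, or None at end of stream.
    """
    frames = queue.Queue(maxsize=max_queue)

    def worker():
        while True:
            frame = read_frame()
            if frame is None:
                break
            frames.put(frame)  # blocks while the queue is full
        frames.put(_STOP)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return frames, thread


def iter_frames(frames):
    """Yield frames from the queue until the reader signals the end."""
    while True:
        item = frames.get()
        if item is _STOP:
            return
        yield item


# Stand-in source: integers instead of images, so the sketch runs anywhere.
source = iter(range(6))
frames, thread = start_frame_reader(lambda: next(source, None), max_queue=2)
received = list(iter_frames(frames))
thread.join(timeout=1)
```

With OpenCV, `read_frame` could wrap `cap.read()`, which returns a success flag and the image, and return `None` when the flag is false. Because `put` blocks when the queue is full, this version never drops frames; for a live camera you may prefer to discard old frames so the detector always sees the most recent one.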