Question

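Goal: detect people in real time from a given remote video stream with minimal latency/lag.

Setup:

Raspberry Pi (2) with a USB webcam, serving an image/video stream via Flask.
Local machine (MacBook Pro) that fetches the video stream and processes the images with OpenCV, Darknet / DarkFlow / YOLO and TensorFlow.
The processed stream is displayed with the detected people marked; each detected person has a rectangle drawn around them.
Python 3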

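I currently have the basic functionality working, BUT it seems very slow. I need each image processed in under a second, yet a frame is only processed about once every few seconds. The result is a displayed video that lags further and further behind the stream and is choppy. From searching, this seems to be a common problem, but I have not found a straightforward answer.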
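To put rough numbers on that lag, here is a minimal sketch of how an unbounded FIFO queue falls behind when frames are grabbed faster than they are processed. The figures are assumptions, not measurements: about 10 grabbed frames per second (an upper bound implied by the 0.1 s sleep in my grab loop) and 2 seconds of processing per frame (a stand-in for "every few seconds"). `backlog_after` is a hypothetical helper written only for this illustration.

```python
def backlog_after(seconds, grab_fps, process_seconds_per_frame):
    """Estimate (frames_waiting, lag_seconds) for an unbounded FIFO queue."""
    produced = int(seconds * grab_fps)                    # frames put on the queue
    consumed = int(seconds / process_seconds_per_frame)   # frames taken off it
    waiting = max(produced - consumed, 0)
    # Approximate age of the next frame to be processed: the waiting frames
    # span this much stream time at the grab rate.
    lag = waiting / grab_fps
    return waiting, lag


# Assumed figures: 10 fps grabbed, 2 s per processed frame, one minute of running.
waiting, lag = backlog_after(60, grab_fps=10, process_seconds_per_frame=2.0)
print(waiting, lag)
```

Under these assumptions the backlog keeps growing for as long as the program runs, which would match a display that drifts further behind the live stream.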

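As some forums suggest, I have implemented the stream grabbing as its own thread, but I think the problem now is simply the time it takes to process each grabbed image.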
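One variation on the grab thread, which is not what my current code does, is to hand over only the newest frame instead of queueing every frame. A minimal sketch, assuming a single producer thread and a `queue.Queue(maxsize=1)`; `put_latest` is a hypothetical helper name, not part of OpenCV or darkflow, and the dictionaries stand in for real frames.

```python
import queue


def put_latest(q, item):
    """Put item into a maxsize=1 queue, discarding any unprocessed older item.

    Assumes exactly one producer thread calls this; with several producers
    the put below could still find the queue full.
    """
    try:
        q.get_nowait()   # drop the stale frame, if one is still waiting
    except queue.Empty:
        pass
    q.put_nowait(item)


latest = queue.Queue(maxsize=1)
for frame_id in range(1, 6):
    put_latest(latest, {"id": frame_id})   # stand-in for a grabbed frame

print(latest.get()["id"])   # the consumer sees only the most recent frame
```

This trades dropped frames for lower latency: the detector always works on a recent image, at the cost of skipping whatever arrived while it was busy.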

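Can the performance be improved? Do I need to do this processing in the cloud, on a system with a good GPU, to get a worthwhile speedup? Am I using the wrong YOLO weights and cfg? I know yolov3 is out, but I think I ran into problems using it with my environment.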

import queue
import threading
import time

import cv2
from darkflow.net.build import TFNet

incoming_frames = queue.Queue()

class Stream(threading.Thread):
    """Grabs frames from the Pi's stream on a background thread."""

    def __init__(self, ID):
        # daemon=True so a crash in the main loop does not leave the process hanging
        super().__init__(daemon=True)
        self.ID = ID
        self.cam = cv2.VideoCapture('http://raspberrypi.local:5000/')
        self._stop_event = threading.Event()  # set by stop() to end the grab loop

    def run(self):
        frame_id = 0
        while not self._stop_event.is_set():
            ret, frame = self.cam.read()
            if ret:
                frame_id += 1
                incoming_frames.put({'frame': frame, 'id': frame_id})
                print("ACQUIRED FRAME " + str(frame_id))
                time.sleep(0.1)
        self.cam.release()

    def stop(self):
        self._stop_event.set()

print("[INFO] Starting Process......")

print("[INFO] Load Model / Weights")
options = {"model": "cfg/yolo.cfg", "load": "bin/yolo.weights", "threshold": 0.1}
tfnet = TFNet(options)

print("[INFO] Start Video Grab Thread")
stream = Stream(0)
stream.start()


while True:
    if not incoming_frames.empty():
        frame = incoming_frames.get()
        result = tfnet.return_predict(frame['frame'])
        print("Processing Frame " + str(frame['id']))
        coordinates = []
        for detection in result:
            if detection['label'] == 'person' and detection['confidence'] >= 0.4:
                top_left = (detection['topleft']['x'], detection['topleft']['y'])
                bottom_right = (detection['bottomright']['x'], detection['bottomright']['y'])
                # Draw a green box around each detected person.
                cv2.rectangle(frame['frame'], top_left, bottom_right, (0, 255, 0), 2)
                coordinates.append({'x': top_left[0], 'y': top_left[1],
                                    'width': bottom_right[0] - top_left[0],
                                    'height': bottom_right[1] - top_left[1]})
        cv2.imshow('Video', frame['frame'])
        # Press 'q' in the video window to leave the loop so the cleanup below runs.
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

stream.stop()
cv2.destroyAllWindows()
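To check my guess that processing time, rather than grabbing, is the bottleneck, each stage could be timed separately. A minimal sketch using only the standard library; `timed`, `report` and `stage_seconds` are hypothetical names for illustration, and the `time.sleep` call is a stand-in for the real `tfnet.return_predict(...)` call.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

stage_seconds = defaultdict(list)   # stage name -> list of measured durations


@contextmanager
def timed(stage):
    """Record how long the enclosed block takes under the given stage name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        stage_seconds[stage].append(time.perf_counter() - start)


def report():
    """Return the mean duration in seconds for every stage seen so far."""
    return {stage: sum(times) / len(times) for stage, times in stage_seconds.items()}


with timed("predict"):
    time.sleep(0.01)   # stand-in for tfnet.return_predict(frame['frame'])

print(report())
```

Wrapping the read, predict and display steps in separate `timed(...)` blocks would show which one accounts for the seconds spent per frame.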

Problem performing near-real-time person detection from a video stream

0 Answers: