Question

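I have trained darkflow on my data set and get good results! I can feed it a pre-recorded image or video and it draws bounding boxes around the right things. Win!

Now I would like to run it live, as has been done with camera feeds, except that I want the feed to come from the screen rather than a camera. I have a specific window, launched from a specific process, or I can just take a section of the screen (from coordinates); either is fine for my application.

Currently I use PIL's ImageGrab and then feed the images into darkflow, but this feels quite slow (maybe a few frames per second), nothing like the 30-ish fps you can get with video files.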

Answer 1

I get more than 25 fps with Python MSS on my slow laptop under Ubuntu.

Here is an example:

from mss import mss
from PIL import Image
import time

def capture_screenshot():
    with mss() as sct:
        monitor = sct.monitors[1]
        sct_img = sct.grab(monitor)
        # Convert to PIL/Pillow Image
        return Image.frombytes('RGB', sct_img.size, sct_img.bgra, 'raw', 'BGRX')

N = 100
t = time.time()
for _ in range(N):
    capture_screenshot()
print("Frame rate = %.2f fps" % (N / (time.time() - t)))

Output:

Frame rate = 27.55 fps
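
Since the question asks about grabbing only part of the screen, the same approach can be narrowed to a region. The following is a minimal sketch, not part of the original answer: `region_to_monitor`, `grab_region` and `measure_fps` are hypothetical helper names, and it assumes NumPy is installed alongside mss. It reuses one `mss()` instance for all frames instead of creating one per capture, which should reduce per-frame overhead.

```python
import time


def region_to_monitor(left, top, right, bottom):
    """Convert corner coordinates into the dict layout that sct.grab() accepts."""
    return {"left": left, "top": top, "width": right - left, "height": bottom - top}


def grab_region(sct, monitor):
    """Grab one frame and return it as a BGR NumPy array (height, width, 3)."""
    import numpy as np  # imported here so region_to_monitor works without NumPy

    shot = sct.grab(monitor)         # raw BGRA pixels
    return np.array(shot)[:, :, :3]  # drop the alpha channel, keep B, G, R


def measure_fps(monitor, n_frames=100):
    """Capture n_frames of the region and return the achieved frames per second."""
    from mss import mss  # imported lazily because it needs a display

    with mss() as sct:  # one instance, reused for every frame
        start = time.time()
        for _ in range(n_frames):
            grab_region(sct, monitor)
        return n_frames / (time.time() - start)
```

On a machine with a display, `measure_fps(region_to_monitor(10, 10, 1010, 810))` returns a frame rate that can be compared with the full-monitor benchmark above; the coordinates here are placeholders.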

Answer 2

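I get more than 40 fps with this script (on an i5-7500 3.4GHz, GTX 1060, 48GB RAM). There are many APIs for capturing the screen. Among them, mss runs much faster and is not difficult to use. Here is an implementation of mss with darkflow (YOLOv2), in which `mon` defines the area of the screen on which to run prediction.

`options` is passed to darkflow and specifies which config file and checkpoint to use, the detection threshold, and how much of the GPU the process may occupy. Before running this script, we need at least one trained model (or TensorFlow checkpoint). Here, `load` is the checkpoint number.

If you think the network detects too many bounding boxes, I recommend raising the threshold, since only detections whose confidence exceeds it are returned.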

import time

import cv2
import numpy as np
from mss import mss
from darkflow.net.build import TFNet

options = {
    'model': 'cfg/tiny-yolo-voc-1c.cfg',
    'load': 5500,
    'threshold': 0.1,
    'gpu': 0.7,
}
tfnet = TFNet(options)
color = (0, 255, 0)  # bounding box color (BGR).

# This defines the area on the screen.
mon = {'top': 10, 'left': 10, 'width': 1000, 'height': 800}
sct = mss()
previous_time = time.time()
while True:
    # Grab the region; mss returns raw BGRA pixels.
    sct_img = sct.grab(mon)
    # Drop the alpha channel to get the BGR layout OpenCV works with.
    frame = cv2.cvtColor(np.array(sct_img), cv2.COLOR_BGRA2BGR)
    # frame = frame[::2, ::2].copy()  # can be used to downscale the input
    results = tfnet.return_predict(frame)
    for result in results:
        tl = (result['topleft']['x'], result['topleft']['y'])
        br = (result['bottomright']['x'], result['bottomright']['y'])
        label = result['label']
        confidence = result['confidence']
        text = '{} : {:.0f}%'.format(label, confidence * 100)
        frame = cv2.rectangle(frame, tl, br, color, 5)
        frame = cv2.putText(frame, text, tl, cv2.FONT_HERSHEY_COMPLEX, 1, (0, 0, 0), 2)
    cv2.imshow('frame', frame)
    # Press 'q' in the preview window to close it and stop the loop.
    if cv2.waitKey(1) & 0xff == ord('q'):
        cv2.destroyAllWindows()
        break
    txt1 = 'fps: %.1f' % (1. / (time.time() - previous_time))
    previous_time = time.time()
    print(txt1)
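
To see how the detection threshold affects the number of boxes, the sketch below filters predictions after the fact instead of reloading the network with a different `threshold`. `filter_by_confidence` is a hypothetical helper, and the sample list is made-up data that only mimics the dictionary layout used in the script above (`label`, `confidence`, `topleft`, `bottomright`).

```python
def filter_by_confidence(results, min_confidence):
    """Keep only predictions whose confidence is at least min_confidence."""
    return [r for r in results if r["confidence"] >= min_confidence]


# Made-up predictions in the same layout as the script's results.
sample = [
    {"label": "target", "confidence": 0.92,
     "topleft": {"x": 40, "y": 60}, "bottomright": {"x": 180, "y": 220}},
    {"label": "target", "confidence": 0.35,
     "topleft": {"x": 300, "y": 90}, "bottomright": {"x": 420, "y": 200}},
    {"label": "target", "confidence": 0.12,
     "topleft": {"x": 600, "y": 400}, "bottomright": {"x": 650, "y": 460}},
]

# A higher cut-off keeps fewer boxes.
for threshold in (0.1, 0.3, 0.5):
    kept = filter_by_confidence(sample, threshold)
    print("threshold %.1f keeps %d box(es)" % (threshold, len(kept)))
```

With this sample, the counts drop from three to two to one as the cut-off rises. Filtering in Python like this makes it cheap to try several values before settling on one for the `threshold` option.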

How to use the screen as a video input to darkflow
