Question

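I am trying to do the following in Python:

Take a screenshot
If the screenshot contains a given reference image (either jpg or png), get the coordinates of that image on the screen

More information:

The reference image will not be large (5x5 pixels would be enough)
It should be as fast as possible, because it is supposed to scan the screen continuously
If possible: it should work on both Windows and Linux

What is the best way to achieve this in Python?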

Edit: Thanks to Leo Antunes, I came up with the following solution:

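import autopy
import cv2
import numpy as np

def bitmap2brg(bmp):
  # Convert an autopy bitmap into an (h, w, 3) uint8 array in BGR order for OpenCV
  w = bmp.width
  h = bmp.height
  a = np.empty((h, w, 3), dtype=np.uint8)

  for r in range(h):
    for c in range(w):
      v = bmp.get_color(c, r)  # packed 0xRRGGBB integer
      a[r, c, 2] = (v >> 16) & 0xFF  # red
      a[r, c, 1] = (v >> 8) & 0xFF   # green
      a[r, c, 0] = v & 0xFF          # blue
  return a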

def grabScreen():
  THRESHOLD = 1

  # reference image
  needle = cv2.imread('img_top_left.png')
  needle_height, needle_width, needle_channels = needle.shape

  # Grabbing with autopy
  screen = autopy.bitmap.capture_screen()
  haystack = bitmap2brg(screen)

  # work through the frame looking for matches above a certain THRESHOLD
  # and mark them with a green circle
  matches = 0
  for pt in np.transpose(np.where(cv2.matchTemplate(haystack, needle, cv2.TM_CCOEFF_NORMED) >= THRESHOLD)):
      # pt is (row, col); cv2.circle expects an integer (x, y) centre
      cv2.circle(haystack, (int(pt[1] + needle_width // 2), int(pt[0] + needle_height // 2)), 10, (0,255,0))
      matches += 1

  # display the altered frame
  print("Number of matches: {}".format(matches))
  cv2.imshow('matches', haystack)
  if cv2.waitKey(0) & 0xFF == ord('q'):
        cv2.destroyAllWindows()

Answer 1

Update: thanks to a great find by the OP himself (autopy), the "right" solution becomes very simple:

import autopy

needle = autopy.bitmap.Bitmap.open('needle.png')

while True:
        haystack = autopy.bitmap.capture_screen()
        found = haystack.find_every_bitmap(needle)
        print(found)
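
For intuition about what an exact bitmap search has to do, here is a naive sketch in pure Python. It is not autopy's implementation and is far too slow for a full screen; the function name find_exact and the tiny list-of-lists "images" are made up for illustration only.

```python
def find_exact(haystack, needle):
    """Return (x, y) top-left corners where needle occurs exactly in haystack.

    Both images are lists of rows; each row is a list of pixel values.
    """
    h_rows, h_cols = len(haystack), len(haystack[0])
    n_rows, n_cols = len(needle), len(needle[0])
    hits = []
    # Slide the needle over every position where it fits entirely
    for y in range(h_rows - n_rows + 1):
        for x in range(h_cols - n_cols + 1):
            # Compare the needle row by row against the window at (x, y)
            if all(haystack[y + dy][x:x + n_cols] == needle[dy]
                   for dy in range(n_rows)):
                hits.append((x, y))
    return hits


# Hypothetical 5x4 "screen" and 2x2 "reference image"
screen = [
    [0, 0, 0, 0, 0],
    [0, 1, 2, 0, 0],
    [0, 3, 4, 0, 0],
    [0, 0, 0, 1, 2],
]
patch = [[1, 2], [3, 4]]
print(find_exact(screen, patch))  # [(1, 1)]
```

The partial copy of the patch in the last row is not reported, because the needle has to fit completely and match on every row.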

The original, suboptimal answer follows:

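You can do this kind of thing on video relatively easily with OpenCV. The only problem is capturing the screen in a portable way. The only theoretically portable library I could find was this one, but I could not get it to work.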

Instead of using a truly portable library, I hacked together a solution around ffmpeg that works on Linux and theoretically could be made to work on Windows and OSX as well:

import cv2
import numpy as np
import subprocess

THRESHOLD = 0.7

# your reference image
needle = cv2.imread('/path/to/some/needle.jpg', cv2.IMREAD_GRAYSCALE)
needle_height, needle_width = needle.shape

width, height = (800, 600) # you could of course detect this

command = [ '/usr/bin/ffmpeg',
        '-f', 'x11grab',
        '-i', ':0.0+0,0',
        '-r', '3', # lower frame-rate for testing
        '-s', '%dx%d' % (width, height),
        '-f', 'rawvideo',
        '-vcodec', 'rawvideo',
        '-pix_fmt', 'bgr24',
        '-']

pipe = subprocess.Popen(command, stdout = subprocess.PIPE, bufsize=10**8)

while(True):
    # get a frame
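    raw = pipe.stdout.read(width*height*3)
    if len(raw) < width*height*3:
        break  # ffmpeg exited or delivered a truncated frame
    # format it into a matrix which can be worked on by openCV
    # (copy because frombuffer returns a read-only view and cv2.circle draws in place)
    original = np.frombuffer(raw, dtype='uint8').reshape((height, width, 3)).copy()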

    # transform it to grayscale for the matching
    haystack = cv2.cvtColor(original, cv2.COLOR_BGR2GRAY)

    # work through the frame looking for matches above a certain THRESHOLD
    # and mark them with a green circle
    for pt in np.transpose(np.where(cv2.matchTemplate(haystack, needle, cv2.TM_CCOEFF_NORMED) >= THRESHOLD)):
        # pt is (row, col); cv2.circle expects an integer (x, y) centre
        cv2.circle(original, (int(pt[1] + needle_width // 2), int(pt[0] + needle_height // 2)), 10, (0,255,0))

    # display the altered frame
    cv2.imshow('matches', original)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

pipe.terminate()  # stop the ffmpeg subprocess
cv2.destroyAllWindows()

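Note that the magic "3" used when reading from the pipe comes from ffmpeg's output pixel format, bgr24: each pixel takes 3 bytes (24 bits), one each for blue, green and red.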
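
To make the arithmetic behind that "3" concrete, here is a small sketch of how a raw bgr24 frame is laid out in memory. The helper names frame_size and pixel_at are made up for this example, and the "frame" is just fake bytes rather than real ffmpeg output.

```python
BYTES_PER_PIXEL = 3  # bgr24: one byte each for blue, green, red


def frame_size(width, height):
    """Number of bytes to read from the pipe for one full frame."""
    return width * height * BYTES_PER_PIXEL


def pixel_at(raw, width, x, y):
    """Return the (b, g, r) bytes of pixel (x, y) in a row-major bgr24 frame."""
    offset = (y * width + x) * BYTES_PER_PIXEL
    b, g, r = raw[offset:offset + BYTES_PER_PIXEL]
    return b, g, r


# Fake 4x2 frame whose bytes are simply 0, 1, 2, ..., 23
width, height = 4, 2
raw = bytes(range(frame_size(width, height)))

print(frame_size(800, 600))        # 1440000 bytes per 800x600 frame
print(pixel_at(raw, width, 1, 1))  # (15, 16, 17)
```

This is the same layout that the reshape to (height, width, 3) in the code above relies on: rows first, then columns, then the three colour bytes.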
