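I want to do some pattern recognition on my screen and am using the Quartz / PyObjC libraries to capture screenshots.
I get the screenshot as a CGImage. I would like to search it for a pattern with the OpenCV library, but I can't find out how to convert the data into something OpenCV can read.
So what I want to do is this: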
#get screenshot and reference pattern
img = getScreenshot() # returns CGImage instance, custom function, using Quartz
reference = cv2.imread('ref/reference_start.png') #get the reference pattern
#search for the pattern using the opencv library
result = cv2.matchTemplate(img, reference, cv2.TM_CCOEFF_NORMED)
#this is what I need
minVal,maxVal,minLoc,maxLoc = cv2.minMaxLoc(result)
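I have no idea how to do this and can't find any information through Google.
Answer 0 (score: 3)
To add to Arqu's answer: if your end goal is to use OpenCV or NumPy, you may find it faster to use np.frombuffer rather than creating a PIL Image first. np.frombuffer takes about the same time as Image.frombuffer, but it saves the step of converting the Image to a numpy array (which takes about 100 ms on my machine; everything else takes about 50 ms).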
import Quartz.CoreGraphics as CG
from PIL import Image
import time
import numpy as np
ct = time.time()
region = CG.CGRectInfinite
# Create screenshot as CGImage
image = CG.CGWindowListCreateImage(
    region,
    CG.kCGWindowListOptionOnScreenOnly,
    CG.kCGNullWindowID,
    CG.kCGWindowImageDefault)
width = CG.CGImageGetWidth(image)
height = CG.CGImageGetHeight(image)
bytesperrow = CG.CGImageGetBytesPerRow(image)
pixeldata = CG.CGDataProviderCopyData(CG.CGImageGetDataProvider(image))
image = np.frombuffer(pixeldata, dtype=np.uint8)
image = image.reshape((height, bytesperrow//4, 4))
image = image[:,:width,:]
print('elapsed:', time.time() - ct)
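For the cv2.matchTemplate call in the question, the screenshot array and the reference loaded with cv2.imread need the same dtype and number of channels, so the alpha channel has to go. The following is a minimal sketch of that step, not a definitive implementation: bgra_buffer_to_bgr is a hypothetical helper name, and a synthetic byte buffer stands in for the CGDataProviderCopyData result so the snippet runs without Quartz.

```python
import numpy as np


def bgra_buffer_to_bgr(pixeldata, width, height, bytes_per_row):
    """Turn a padded BGRA byte buffer into a contiguous BGR uint8 array."""
    flat = np.frombuffer(pixeldata, dtype=np.uint8)
    # Each row holds bytes_per_row // 4 pixels, some of which may be padding.
    rows = flat.reshape((height, bytes_per_row // 4, 4))
    # Keep only the real pixels and the first three channels (B, G, R),
    # then copy into contiguous memory.
    return np.ascontiguousarray(rows[:, :width, :3])


# Synthetic stand-in for a screenshot: 3x2 pixels, rows padded to 16 bytes.
width, height, bytes_per_row = 3, 2, 16
fake = np.zeros((height, bytes_per_row), dtype=np.uint8)
fake[:, : width * 4] = np.arange(width * 4, dtype=np.uint8)

screen = bgra_buffer_to_bgr(fake.tobytes(), width, height, bytes_per_row)
print(screen.shape, screen.dtype)
```

With a real capture, the returned array should be usable as the first argument of cv2.matchTemplate alongside a reference image read by cv2.imread.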
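Answer 1 (score: 1)
I've been playing around with this, but I needed more performance, so saving to a file and then reading it back was a bit too slow. After a lot of searching and fiddling I came up with this: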
#get_pixels returns a image reference from CG.CGWindowListCreateImage
imageRef = self.get_pixels()
pixeldata = CG.CGDataProviderCopyData(CG.CGImageGetDataProvider(imageRef))
image = Image.frombuffer("RGBA", (self.width, self.height), pixeldata, "raw", "RGBA", self.stride, 1)
#Color correction from BGRA to RGBA
b, g, r, a = image.split()
image = Image.merge("RGBA", (r, g, b, a))
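Also note that because my image was not a standard size (it had to be padded), I saw some odd behavior and had to adjust the stride of the buffer. If you take full-screen screenshots at a standard screen width, you can pass a stride of 0 and it will be calculated automatically.
You can now convert the PIL image to a numpy array to make it easier to work with in OpenCV: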
image = np.array(image)
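One thing to watch when handing such an array to OpenCV: cv2.imread returns channels in BGR order, so an RGBA array has red and blue swapped relative to a reference image loaded that way. A minimal sketch of reordering the channels with NumPy indexing follows; rgba_to_bgr is a hypothetical helper, and the two-pixel array is made-up data used only for illustration.

```python
import numpy as np


def rgba_to_bgr(rgba):
    """Reorder an (H, W, 4) RGBA array into a contiguous (H, W, 3) BGR array."""
    # Index list [2, 1, 0] picks B, G, R and drops the alpha channel.
    return np.ascontiguousarray(rgba[..., [2, 1, 0]])


# Made-up 1x2 RGBA image.
rgba = np.array([[[10, 20, 30, 255], [40, 50, 60, 255]]], dtype=np.uint8)
bgr = rgba_to_bgr(rgba)
print(bgr.shape, bgr[0, 0].tolist())
```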
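Answer 2 (score: 0)
The following code takes a screenshot and saves it to a file. To read it into PIL, just use the standard Image.open(path). The code is surprisingly fast if you keep the region small. For an 800x800 pixel region, each capture takes less than 50 ms on my i7. At the full resolution of a dual-monitor setup (2880x1800 + 2560x1440), each capture takes about 1.9 seconds.
Source: https://github.com/troq/flappy-bird-player/blob/master/screenshot.py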
import Quartz
import LaunchServices
from Cocoa import NSURL
import Quartz.CoreGraphics as CG
def screenshot(path, region=None):
    """saves screenshot of given region to path
    :path: string path to save to
    :region: tuple of (x, y, width, height)
    :returns: nothing
    """
    if region is None:
        region = CG.CGRectInfinite
    # Create screenshot as CGImage
    image = CG.CGWindowListCreateImage(
        region,
        CG.kCGWindowListOptionOnScreenOnly,
        CG.kCGNullWindowID,
        CG.kCGWindowImageDefault)
    dpi = 72  # FIXME: Should query this from somewhere, e.g for retina displays
    url = NSURL.fileURLWithPath_(path)
    dest = Quartz.CGImageDestinationCreateWithURL(
        url,
        LaunchServices.kUTTypePNG,  # file type
        1,  # 1 image in file
        None
    )
    properties = {
        Quartz.kCGImagePropertyDPIWidth: dpi,
        Quartz.kCGImagePropertyDPIHeight: dpi,
    }
    # Add the image to the destination, characterizing the image with
    # the properties dictionary.
    Quartz.CGImageDestinationAddImage(dest, image, properties)
    # When all the images (only 1 in this example) are added to the destination,
    # finalize the CGImageDestination object.
    Quartz.CGImageDestinationFinalize(dest)


if __name__ == '__main__':
    # Capture full screen
    screenshot("testscreenshot_full.png")
    # Capture region (100x100 box from top-left)
    region = CG.CGRectMake(0, 0, 100, 100)
    screenshot("testscreenshot_partial.png", region=region)
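The timings quoted in this answer depend on the machine and the capture region, so it may be worth measuring on your own setup. Here is a small sketch of one way to do that; average_seconds is a hypothetical helper, and the lambda is a placeholder for the real capture call.

```python
import time


def average_seconds(capture, repeats=5):
    """Call capture() `repeats` times and return the mean wall-clock time per call."""
    start = time.perf_counter()
    for _ in range(repeats):
        capture()
    return (time.perf_counter() - start) / repeats


# Placeholder workload; on macOS this could be something like
# lambda: screenshot("shot.png", region=region)
calls = []
avg = average_seconds(lambda: calls.append(1), repeats=3)
print("average seconds per call:", avg)
```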
Answer 3 (score: 0)
This is an enhanced version of Arqu's answer. PIL (or at least Pillow) can load BGRA data directly, without the split and merge.
width = Quartz.CGImageGetWidth(cgimg)
height = Quartz.CGImageGetHeight(cgimg)
pixeldata = Quartz.CGDataProviderCopyData(Quartz.CGImageGetDataProvider(cgimg))
bpr = Quartz.CGImageGetBytesPerRow(cgimg)
# Convert to PIL Image. Note: CGImage's pixeldata is BGRA
image = Image.frombuffer("RGBA", (width, height), pixeldata, "raw", "BGRA", bpr, 1)
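To see the effect of the "BGRA" raw mode without a Mac at hand, the same Image.frombuffer call can be tried on a hand-built buffer. This is only a sketch under stated assumptions: bgra_bytes_to_pil is a hypothetical wrapper, and the pixel bytes and the 4 bytes of row padding are made up for the demonstration.

```python
from PIL import Image


def bgra_bytes_to_pil(pixeldata, width, height, bytes_per_row):
    """Decode a (possibly row-padded) BGRA byte buffer into an RGBA PIL image."""
    return Image.frombuffer(
        "RGBA", (width, height), pixeldata, "raw", "BGRA", bytes_per_row, 1
    )


# Made-up 2x2 image: each row is one blue pixel, one red pixel, then padding.
blue = bytes([255, 0, 0, 255])  # B, G, R, A
red = bytes([0, 0, 255, 255])   # B, G, R, A
padding = bytes(4)
row = blue + red + padding

demo = bgra_bytes_to_pil(row * 2, 2, 2, len(row))
first_pixel = demo.getpixel((0, 0))   # read back as (R, G, B, A)
second_pixel = demo.getpixel((1, 0))
print(first_pixel, second_pixel)
```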
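Answer 4 (score: 0)
All of these answers ignore Tom Gangemi's comment on this answer: images whose width is not a multiple of 64 come out garbled. I came up with a working approach using NumPy strides: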
cg_img = CG.CGWindowListCreateImage(
    CG.CGRectNull,
    CG.kCGWindowListOptionIncludingWindow,
    wnd_id,
    CG.kCGWindowImageBoundsIgnoreFraming | CG.kCGWindowImageNominalResolution
)
bpr = CG.CGImageGetBytesPerRow(cg_img)
width = CG.CGImageGetWidth(cg_img)
height = CG.CGImageGetHeight(cg_img)
cg_dataprovider = CG.CGImageGetDataProvider(cg_img)
cg_data = CG.CGDataProviderCopyData(cg_dataprovider)
np_raw_data = np.frombuffer(cg_data, dtype=np.uint8)
np_data = np.lib.stride_tricks.as_strided(np_raw_data,
                                          shape=(height, width, 3),
                                          strides=(bpr, 4, 1),
                                          writeable=False)
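The strides tell NumPy to jump bpr bytes per row and 4 bytes per pixel while reading only the first 3 bytes (B, G, R) of each pixel, which skips both the row padding and the alpha channel. A minimal sketch on synthetic data, assuming a made-up padded row length of 16 bytes for a 3-pixel-wide image, checks the strided view against a plain reshape and slice:

```python
import numpy as np

width, height = 3, 2
bpr = 16  # hypothetical padded row length in bytes (3 pixels * 4 bytes + 4 padding)
raw = np.arange(height * bpr, dtype=np.uint8)  # stand-in for the copied pixel data

# Strided view: rows are bpr bytes apart, pixels 4 bytes apart, channels 1 byte apart.
view = np.lib.stride_tricks.as_strided(
    raw, shape=(height, width, 3), strides=(bpr, 4, 1), writeable=False
)

# Equivalent result via reshape and slicing, for comparison.
same = raw.reshape(height, bpr // 4, 4)[:, :width, :3]
matches = np.array_equal(view, same)

# The view is not contiguous in memory; copy it if a contiguous array is needed.
bgr = np.ascontiguousarray(view)
print(matches, view.flags['C_CONTIGUOUS'], bgr.flags['C_CONTIGUOUS'])
```

Copying with np.ascontiguousarray costs one extra pass over the pixels, but it gives an ordinary array that does not depend on the padded source buffer's layout.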