Question

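I need a Python program that takes a small image, determines whether it exists inside a larger image, and if so reports its location. If not, it should report that. (In my case, the large image will be a screenshot, and the small image is something that may or may not be on screen, in an HTML5 canvas.) Looking online, I found out about template matching in OpenCV, which has excellent Python bindings. I tried the following, based on very similar code I found online, using numpy as well: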

import cv2
import numpy as np
image = cv2.imread("screenshot.png")
template = cv2.imread("button.png")
result = cv2.matchTemplate(image,template,cv2.TM_CCOEFF_NORMED)
StartButtonLocation = np.unravel_index(result.argmax(),result.shape)

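This doesn't do what I need, because it always returns a point in the larger image: the point where the match is closest, no matter how bad that match is. I want something that finds an exact, pixel-for-pixel match of the smaller image in the larger image and, if none exists, raises an exception, or returns False, or something like that. It also needs to be fairly fast. Does anyone have a good idea how to do this?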

Answer 1

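I will propose an answer that works fast and perfectly if you are looking for an exact match in both size and pixel values.

The idea is to do a brute-force search for the wanted h x w template in a larger H x W image. The brute-force approach consists of looking at every possible h x w window over the image and checking for pixel-by-pixel correspondence with the template. On its own this is computationally very expensive, but it can be accelerated.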
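To make the cost of the plain brute-force search concrete, here is a minimal sketch of it, assuming `im` and `tpl` are NumPy arrays with the same number of channels. The name `find_image_bruteforce` is made up for this illustration and does not come from the answer.

```python
import numpy as np

def find_image_bruteforce(im, tpl):
    """Return (row, col) of the first exact match of tpl in im, or None."""
    im = np.atleast_3d(im)
    tpl = np.atleast_3d(tpl)
    H, W = im.shape[:2]
    h, w = tpl.shape[:2]
    # Try every top-left corner at which an h x w window still fits.
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            if np.array_equal(im[y:y + h, x:x + w], tpl):
                return (y, x)
    return None

# Tiny check: every value is unique, so the match is unambiguous.
im = np.arange(6 * 8).reshape(6, 8)
tpl = im[2:4, 3:6].copy()
print(find_image_bruteforce(im, tpl))  # (2, 3)
```

Every candidate position costs a full h x w comparison here, which is the work the integral-image filter is meant to avoid.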

im = np.atleast_3d(im)
H, W, D = im.shape[:3]
h, w = tpl.shape[:2]

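By using integral images, the sum inside an h x w window starting at every pixel can be calculated really fast. An integral image is a summed-area table (a cumulatively summed array), which numpy computes very quickly: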

sat = im.cumsum(1).cumsum(0)
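As a small illustration of what that line produces, the sketch below builds the integral image of a toy 2 x 3 array (the values are arbitrary). Each entry `sat[y, x]` holds the sum of all pixels above and to the left of `(y, x)`, inclusive.

```python
import numpy as np

im = np.array([[1, 2, 3],
               [4, 5, 6]])

# Cumulative sum along the columns, then along the rows.
sat = im.cumsum(1).cumsum(0)
print(sat)
# [[ 1  3  6]
#  [ 5 12 21]]

# The bottom-right entry is the sum of the whole image.
print(sat[-1, -1] == im.sum())  # True
```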

It also has very nice properties, such as computing the sum of all the values within a window with only 4 arithmetic operations:

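(Figure from Wikipedia: a rectangle with corner points A, B, C, D on the summed-area table; the sum over the rectangle is D - B - C + A.)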
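A worked example of that corner arithmetic, on a toy 4 x 5 image with assumed values 1..20, compares the four-lookup result with a direct sum over the same window:

```python
import numpy as np

im = np.arange(1, 21).reshape(4, 5)   # 4 x 5 image with values 1..20
sat = im.cumsum(1).cumsum(0)

# Window of height h=2 and width w=3 whose top-left pixel is (1, 1).
h, w = 2, 3
y, x = 0, 0                           # point A: one row and one column before the window
A = sat[y, x]
B = sat[y, x + w]
C = sat[y + h, x]
D = sat[y + h, x + w]

window_sum = D - B - C + A
direct_sum = im[y + 1:y + 1 + h, x + 1:x + 1 + w].sum()
print(window_sum, direct_sum)  # 63 63
```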

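Thus, by calculating the sum of the template and matching it against the sums of the h x w windows over the integral image, it is easy to find a list of "possible windows" whose inner values sum to the same total as the values in the template (a quick approximation).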

iA, iB, iC, iD = sat[:-h, :-w], sat[:-h, w:], sat[h:, :-w], sat[h:, w:]
lookup = iD - iB - iC + iA

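The above is a numpy vectorization of the operation shown in the figure, applied to all the possible h x w rectangles over the image (thus, really quick).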
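The following sketch checks, on a small random image (the sizes and the seed are arbitrary choices), what each entry of `lookup` actually contains. With this slicing, `lookup[y, x]` is the per-channel sum of the window whose top-left pixel is `(y+1, x+1)`.

```python
import numpy as np

rng = np.random.default_rng(0)
im = rng.integers(0, 256, size=(6, 7, 3))
h, w = 2, 3

sat = im.cumsum(1).cumsum(0)
iA, iB, iC, iD = sat[:-h, :-w], sat[:-h, w:], sat[h:, :-w], sat[h:, w:]
lookup = iD - iB - iC + iA

# Compare every entry with a direct per-channel sum over the shifted window.
ok = all(
    np.array_equal(lookup[y, x],
                   im[y + 1:y + 1 + h, x + 1:x + 1 + w].sum(axis=(0, 1)))
    for y in range(lookup.shape[0])
    for x in range(lookup.shape[1])
)
print(lookup.shape, ok)  # (4, 4, 3) True
```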

This reduces the number of possible windows a lot (to 2 in one of my tests). The last step is to check for exact matches with the template:

possible_match = np.where(np.logical_and.reduce([lookup[..., i] == tplsum[i] for i in range(D)]))
for y, x in zip(*possible_match):
    if np.all(im[y+1:y+h+1, x+1:x+w+1] == tpl):
        return (y+1, x+1)

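Note that here the y and x coordinates correspond to the point A in the image, which is the row and column just before the template.

Putting it all together: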

def find_image(im, tpl):
    im = np.atleast_3d(im)
    tpl = np.atleast_3d(tpl)
    H, W, D = im.shape[:3]
    h, w = tpl.shape[:2]

    # Integral image and template sum per channel
    sat = im.cumsum(1).cumsum(0)
    tplsum = np.array([tpl[:, :, i].sum() for i in range(D)])

    # Calculate lookup table for all the possible windows
    iA, iB, iC, iD = sat[:-h, :-w], sat[:-h, w:], sat[h:, :-w], sat[h:, w:] 
    lookup = iD - iB - iC + iA
    # Possible matches
    possible_match = np.where(np.logical_and.reduce([lookup[..., i] == tplsum[i] for i in range(D)]))

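    # Find exact match. lookup[y, x] sums the window whose top-left pixel is
    # (y+1, x+1), so windows touching row 0 or column 0 are never candidates.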
    for y, x in zip(*possible_match):
        if np.all(im[y+1:y+h+1, x+1:x+w+1] == tpl):
            return (y+1, x+1)

    raise Exception("Image not found")

It works with both grayscale and color images, and runs in 7 ms for a 303x384 color image with a 50x50 template.
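One caveat with the function as written: because `lookup[y, x]` describes the window starting at `(y+1, x+1)`, a template sitting on the first row or first column of the image is never tested. A possible variant, sketched here under the assumption that the images hold integer pixel values (for example uint8), pads the integral image with a leading row and column of zeros so those positions are covered too. The name `find_image_padded` is hypothetical, and it returns None instead of raising.

```python
import numpy as np

def find_image_padded(im, tpl):
    """Return (row, col) of the first exact match of tpl in im, or None."""
    im = np.atleast_3d(im)
    tpl = np.atleast_3d(tpl)
    H, W, D = im.shape[:3]
    h, w = tpl.shape[:2]
    if h > H or w > W:
        return None

    # Integral image with a zero row/column in front: sat[y, x] == im[:y, :x].sum()
    sat = np.zeros((H + 1, W + 1, D), dtype=np.int64)
    sat[1:, 1:] = im.cumsum(1, dtype=np.int64).cumsum(0)
    tplsum = tpl.sum(axis=(0, 1), dtype=np.int64)

    # lookup[y, x] is the per-channel sum of im[y:y+h, x:x+w]
    lookup = sat[h:, w:] - sat[:-h, w:] - sat[h:, :-w] + sat[:-h, :-w]
    candidates = np.argwhere((lookup == tplsum).all(-1))

    # Only windows with the right sum get the full pixel comparison.
    for y, x in candidates:
        if np.array_equal(im[y:y + h, x:x + w], tpl):
            return (int(y), int(x))
    return None

im = np.arange(90, dtype=np.uint8).reshape(5, 6, 3)
print(find_image_padded(im, im[0:2, 0:3]))  # (0, 0)
```

With the padding, the returned coordinates are the top-left pixel of the match directly, so no `+1` offset is needed.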

A practical example:

>>> from skimage import data
>>> from skimage.color import gray2rgb
>>> im = gray2rgb(data.coins())
>>> tpl = im[170:220, 75:130].copy()

>>> y, x = find_image(im, tpl)
>>> y, x
(170, 75)

And to illustrate the result:

(Figure: the original image on the left, the template on the right.)

And here is the exact match:

>>> import matplotlib.pyplot as plt
>>> from matplotlib.patches import Rectangle
>>> fig, ax = plt.subplots()
>>> ax.imshow(im)
>>> rect = Rectangle((x, y), tpl.shape[1], tpl.shape[0], edgecolor='r', facecolor='none')
>>> ax.add_patch(rect)


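Finally, an example of the possible_matches found in this test: the sum over two windows in the image is the same, but the last step of the function filters out the one that doesn't exactly match the template.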

Answer 2

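Since you are happy with OpenCV, I would suggest that you start from what you have already done and get the best match. Once you have the location of the best match, you can check whether it is actually a good match.

Checking whether it is a good match should be as easy as extracting the matching region and comparing it to the template. To extract the region, you might need to use cv2.minMaxLoc(result) and process the output. How to extract it seems to depend on the method used for comparing the images; the original answer linked to examples.

Once you have extracted the region, you should be able to compare the two using numpy.allclose or some other method.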
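A minimal sketch of that two-step idea follows, assuming the question's `TM_CCOEFF_NORMED` method (for which the best match is the maximum of `result`). The helper names `is_exact_match` and `find_exact_with_opencv` are invented for this example, and `np.array_equal` is used in place of `numpy.allclose` because the question asks for a pixel-exact match.

```python
import numpy as np

def is_exact_match(image, template, top_left):
    """Check whether template sits in image exactly at top_left = (x, y)."""
    x, y = top_left
    h, w = template.shape[:2]
    region = image[y:y + h, x:x + w]
    # A region clipped by the image border has a different shape,
    # and array_equal reports that as "not equal".
    return bool(np.array_equal(region, template))

def find_exact_with_opencv(image, template):
    """Return the (x, y) of the best match if it is pixel-exact, else None."""
    import cv2  # imported here so is_exact_match can be used without OpenCV
    result = cv2.matchTemplate(image, template, cv2.TM_CCOEFF_NORMED)
    # minMaxLoc returns (min_val, max_val, min_loc, max_loc); locations are (x, y).
    _, _, _, max_loc = cv2.minMaxLoc(result)
    return max_loc if is_exact_match(image, template, max_loc) else None
```

With the question's files this would be called as `find_exact_with_opencv(cv2.imread("screenshot.png"), cv2.imread("button.png"))`. Note that OpenCV reports locations as (x, y), while NumPy indexing is [row, column].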

Answer 3

I tried to use the last script to find an image embedded in the images of a directory, but it doesn't work. Here is my work:

import cv2
import numpy as np
import os
import glob

pic2 = "/home/tse/Images/pictures/20/redu.png"
path = "/home/tse/Images/pictures/20/*.png"
for pic1 in glob.glob(path):
    def find_image(pic1, pic2):
        dim1_ori = pic1.shape[0]
        dim2_ori = pic1.shape[1]
        dim1_emb = pic2.shape[0]
        dim2_emb = pic2.shape[1]

        v1_emb = pic2[0, 0]
        v2_emb = pic2[0, dim2_emb - 1]
        v3_emb = pic2[dim1_emb - 1, dim2_emb - 1]
        v4_emb = pic2[dim1_emb - 1, 0]

        mask = (pic1 == v1_emb).all(-1)
        found = 0

        if np.sum(mask) > 0: # Check if a pixel identical to v1_emb
            result = np.argwhere(mask)
            mask = (result[:, 0] <= dim1_ori - dim1_emb) & (result[:, 1] <= dim2_ori - dim2_emb)

            if np.sum(mask) > 0: # Check if the pixel induces a rectangle
                result = result[mask] + [0, dim2_emb - 1]
                mask = [(pic1[tuple(coor)] == v2_emb).all(-1) for coor in result]

                if np.sum(mask) > 0: # Check if a pixel identical to v2_emb
                    result = result[mask] + [dim1_emb-1, 0]
                    mask = [(pic1[tuple(coor)] == v3_emb).all(-1) for coor in result]

                    if np.sum(mask) > 0: # Check if a pixel identical to v3_emb
                        result = result[mask] - [0, dim2_emb - 1]
                        mask = [(pic1[tuple(coor)] == v4_emb).all(-1) for coor in result]

                        if np.sum(mask) > 0: # Check if a pixel identical to v4_emb
                            result = result[mask]
                            result[:, 0] = result[:, 0] - (dim1_emb - 1)
                            result = np.c_[result, result[:, 0] + dim1_emb, result[:, 1] + dim2_emb]

                            for coor in result: # Check if the induced rectangle is identical to the embedding
                                induced_rectangle = pic1[coor[0]:coor[2], coor[1]:coor[3]]
                                if np.array_equal(induced_rectangle, pic2):
                                    found = 1
                                    break
        if found == 0:
            return('No image found')
            print("Not found")
        else:
            return('Image found')
            print("Found")
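One likely reason the script above produces no output is that `find_image` is defined inside the loop but never called, and `pic1` and `pic2` are still path strings rather than image arrays when the function body expects `.shape`. A minimal sketch of the missing loop is given below. The name `search_directory` and its `load` parameter are assumptions made for this example: `load` is whatever turns a path into an array (for instance `cv2.imread`), and `find_image` is any matcher from this thread.

```python
import glob

def search_directory(pattern, template, find_image, load):
    """Run find_image(candidate, template) on every file matching pattern."""
    results = {}
    for path in sorted(glob.glob(pattern)):
        candidate = load(path)
        if candidate is None:  # e.g. cv2.imread returns None for unreadable files
            continue
        results[path] = find_image(candidate, template)
    return results
```

With OpenCV this might be used as `search_directory(path, cv2.imread(pic2), find_image, cv2.imread)`. Since `redu.png` lives in the same directory as the searched files, it would also be compared against itself.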

Answer 4

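This is an improvement on @Imanol Luengo's function. To reduce computation, we first filter the pixels that are identical to the top-left vertex of the template. Then we only check the rectangles induced by these pixels.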

def find_image(pic1, pic2): # pic1 is the original, while pic2 is the embedding

    dim1_ori = pic1.shape[0]
    dim2_ori = pic1.shape[1]

    dim1_emb = pic2.shape[0]
    dim2_emb = pic2.shape[1]

    v1_emb = pic2[0, 0]
    v2_emb = pic2[0, dim2_emb - 1]
    v3_emb = pic2[dim1_emb - 1, dim2_emb - 1]
    v4_emb = pic2[dim1_emb - 1, 0]

    mask = (pic1 == v1_emb).all(-1)
    found = 0

    if np.sum(mask) > 0: # Check if a pixel identical to v1_emb
        result = np.argwhere(mask)
        mask = (result[:, 0] <= dim1_ori - dim1_emb) & (result[:, 1] <= dim2_ori - dim2_emb)

        if np.sum(mask) > 0: # Check if the pixel induces a rectangle
            result = result[mask] + [0, dim2_emb - 1]
            mask = [(pic1[tuple(coor)] == v2_emb).all(-1) for coor in result]

            if np.sum(mask) > 0: # Check if a pixel identical to v2_emb
                result = result[mask] + [dim1_emb-1, 0]
                mask = [(pic1[tuple(coor)] == v3_emb).all(-1) for coor in result]

                if np.sum(mask) > 0: # Check if a pixel identical to v3_emb
                    result = result[mask] - [0, dim2_emb - 1]
                    mask = [(pic1[tuple(coor)] == v4_emb).all(-1) for coor in result]

                    if np.sum(mask) > 0: # Check if a pixel identical to v4_emb
                        result = result[mask]
                        result[:, 0] = result[:, 0] - (dim1_emb - 1)
                        result = np.c_[result, result[:, 0] + dim1_emb, result[:, 1] + dim2_emb]

                        for coor in result: # Check if the induced rectangle is identical to the embedding
                            induced_rectangle = pic1[coor[0]:coor[2], coor[1]:coor[3]]
                            if np.array_equal(induced_rectangle, pic2):
                                found = 1
                                break
    if found == 0:
        return('No image found')
    else:
        return('Image found')
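The function above only reports whether the template is present, while the question also asks for its location. A compact sketch of the same filter-by-top-left-pixel idea that returns the position is shown below. It checks only the top-left corner before the full comparison, not all four corners, and the name `find_image_position` is hypothetical.

```python
import numpy as np

def find_image_position(pic1, pic2):
    """Return (row, col) of the first exact match of pic2 in pic1, or None."""
    pic1 = np.atleast_3d(pic1)
    pic2 = np.atleast_3d(pic2)
    H, W = pic1.shape[:2]
    h, w = pic2.shape[:2]
    if h > H or w > W:
        return None

    # Keep only top-left corners where the template still fits and where
    # the pixel equals the template's top-left pixel.
    region = pic1[:H - h + 1, :W - w + 1]
    candidates = np.argwhere((region == pic2[0, 0]).all(-1))

    # Full comparison only for the surviving candidates.
    for y, x in candidates:
        if np.array_equal(pic1[y:y + h, x:x + w], pic2):
            return (int(y), int(x))
    return None

im = np.arange(20 * 30 * 3).reshape(20, 30, 3)
print(find_image_position(im, im[3:8, 0:6]))  # (3, 0)
```

Returning None when nothing matches lets the caller decide whether to raise, as the question suggested.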

Determine if an image exists within a larger image, and if so, find it, using Python