Question

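I've been trying to find something that automatically finds all the shared regions between two images, explicitly not based on pixel matching or differencing, and to be fair, I've basically come up with nothing after a fair bit of searching.

Say I have the following two images, in this case website screenshots. The first is the "baseline":

and the second is very similar, but with some modified CSS, so that entire blocks have been moved around. No text content changes and no box dimension changes, just some elements repositioned:

In this case (but really in every other case where two images are compared and one is a derivative of the other), their pixel diff is effectively useless for seeing what changed:

In fact, even if we apply some simple diff exaggeration, the result is still useless, because we're still looking at pixel diffs rather than diffs based on what changed, so we won't (in any way) be looking at the actual modifications to the visual information:

So this is like comparing two books and then deciding they are different based on how many values of n we can find for which book1.letters[n] != book2.letters[n]...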
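The book analogy can be made concrete with a toy example. This is only a minimal sketch using Python's standard `difflib`; the two strings `book1` and `book2` are made up for illustration and stand in for the two screenshots.

```python
from difflib import SequenceMatcher

# Hypothetical "books": book2 is book1 with one character inserted at the front.
book1 = "abcdefgh"
book2 = "Xabcdefgh"

# Position-by-position comparison: every letter has shifted by one place,
# so every compared position differs.
mismatches = sum(1 for x, y in zip(book1, book2) if x != y)

# Content-based comparison: find the runs that both strings share,
# regardless of where they sit in each string.
matcher = SequenceMatcher(None, book1, book2, autojunk=False)
shared = [(m.a, m.b, m.size) for m in matcher.get_matching_blocks() if m.size > 0]

print(mismatches)
print(shared)
```

The positional comparison reports that everything changed, while the content-based one reports a single shared run that merely starts at a different offset, which is the kind of answer wanted for images.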

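So, what I'm looking for is a way to compute regions of similarity, showing which parts of the two images encode the same information, but not necessarily in the same bounding boxes.

For instance, in the two images above, almost all of the data is the same, with just some parts relocated. The only true difference is that there is some mystery whitespace.

With similar regions color coded:

and the correspondence:

I can't find a single tool that does this, and I can't even find tutorials that show how to implement it using opencv or similar technologies. Maybe I'm searching for the wrong terms, maybe no one has ever written an image comparison tool for this (which seems beyond belief?), so I'm taking the risk of asking here: I have searched and researched as much as I can. What are my options if I need this as a tool that can be run as part of a normal (open source) toolchain for QA/testing? (So: not an expensive plugin for equally expensive commercial software.)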

Answer 1

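Here is a suggestion for the initial region clustering.

First, we subtract the 2 images to find the regions that differ. Then we resize the result to a smaller scale, for faster speed and easier clustering.

Then we run a morphological close operation to cluster all nearby objects together.

Threshold the result to keep only the strong signals.

Run connected component analysis to get all the bounding boxes.

Then check all box intersections and merge them. In my case, I simply redrew all the bounding boxes in filled (solid) mode and re-analysed the components to get the regions.

With this, we can run the same process on the second image and cross-match each extracted region using a simple cross-correlation matching method or any other fancy matching method. In this case, simple width and height matching between the regions would also do.

Here is the code I wrote. Hope it helps.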

import cv2
import numpy as np


# Function to fill all the bounding box
def fill_rects(image, stats):

    for i,stat in enumerate(stats):
        if i > 0:
            p1 = (stat[0],stat[1])
            p2 = (stat[0] + stat[2],stat[1] + stat[3])
            cv2.rectangle(image,p1,p2,255,-1)


# Load image file
img1 = cv2.imread('img1.jpg',0)
img2 = cv2.imread('img2.jpg',0)

# Subtract the 2 image to get the difference region
img3 = cv2.subtract(img1,img2)

# Make it smaller to speed up everything and easier to cluster
small_img = cv2.resize(img3,(0,0),fx = 0.25, fy = 0.25)


# Morphological close process to cluster nearby objects
fat_img = cv2.dilate(small_img, None,iterations = 3)
fat_img = cv2.erode(fat_img, None,iterations = 3)

fat_img = cv2.dilate(fat_img, None,iterations = 3)
fat_img = cv2.erode(fat_img, None,iterations = 3)

# Threshold strong signals
_, bin_img = cv2.threshold(fat_img,20,255,cv2.THRESH_BINARY)

# Analyse connected components
num_labels, labels, stats, centroids = cv2.connectedComponentsWithStats(bin_img)

# Cluster all the intersected bounding box together
rsmall, csmall = np.shape(small_img)
new_img1 = np.zeros((rsmall, csmall), dtype=np.uint8)

fill_rects(new_img1,stats)


# Analyse New connected components to get final regions
num_labels_new, labels_new, stats_new, centroids_new = cv2.connectedComponentsWithStats(new_img1)


labels_disp = np.uint8(200*labels/np.max(labels)) + 50
labels_disp2 = np.uint8(200*labels_new/np.max(labels_new)) + 50



cv2.imshow('diff',img3)
cv2.imshow('small_img',small_img)
cv2.imshow('fat_img',fat_img)
cv2.imshow('bin_img',bin_img)
cv2.imshow("labels",labels_disp)
cv2.imshow("labels_disp2",labels_disp2)
cv2.waitKey(0)
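
The code above stops at region extraction; the cross-matching step mentioned in the text is not shown. The following is a minimal sketch of the "simple width and height matching" idea, not the answer author's implementation. The function name `match_regions_by_size` and the tolerance `tol` are assumptions made for illustration. Boxes are `(x, y, w, h)` tuples, which is the layout of the first four columns of the `stats` array returned by `cv2.connectedComponentsWithStats` (row 0 is the background and should be skipped).

```python
def match_regions_by_size(boxes_a, boxes_b, tol=2):
    """Pair boxes (x, y, w, h) whose width and height agree within tol pixels.

    Returns a list of (index_in_a, index_in_b) pairs. Each box in boxes_b
    is used at most once; the closest size match wins.
    """
    pairs = []
    unused_b = list(range(len(boxes_b)))
    for i, (_, _, wa, ha) in enumerate(boxes_a):
        best, best_err = None, None
        for j in unused_b:
            _, _, wb, hb = boxes_b[j]
            dw, dh = abs(wa - wb), abs(ha - hb)
            if dw <= tol and dh <= tol and (best is None or dw + dh < best_err):
                best, best_err = j, dw + dh
        if best is not None:
            unused_b.remove(best)
            pairs.append((i, best))
    return pairs


# Hypothetical boxes extracted from the first and second image.
boxes_a = [(10, 10, 100, 40), (10, 60, 30, 30)]
boxes_b = [(200, 5, 30, 31), (10, 80, 101, 40)]
pairs = match_regions_by_size(boxes_a, boxes_b)
print(pairs)
```

Size alone is ambiguous when several regions have similar dimensions, so in practice the candidates found this way would still need a content check such as the cross-correlation the answer mentions.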

Answer 2

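Answering my own question: opencv (for python) paired with scikit-image pretty much gets us there in two steps.

1. Perform an SSIM comparison between the two images, capturing the various differences as bbox contours relative to the second image.
2. For each contour in the second image, perform template matching against the first image, which tells us whether a diff contour is a "change" or a "translation".

In code, assuming two images imageA and imageB, with the same dimensions: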

import cv2
import imutils
from skimage.metrics import structural_similarity

# ...a bunch of functions will be going here...

diffs = compare(imageA, imageB, gray(imageA), gray(imageB), [])

if len(diffs) > 0:
    highlight_diffs(imageA, imageB, diffs)

else:
    print("no differences detected")

with:

def gray(img):
    return cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

def compare(or1, or2, im1, img2, diffs):
    (score, diff) = structural_similarity(im1, img2, full=True)
    diff = (diff * 255).astype("uint8")

    thresh = cv2.threshold(diff, 0, 255,
        cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
    contours = cv2.findContours(thresh.copy(),
        cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    contours = imutils.grab_contours(contours)

    # aggregate the contours, throwing away duplicates
    for c in contours:
        (x, y, w, h) = cv2.boundingRect(c)
        region = [x, y, x + w, y + h]
        try:
            diffs.index(region)
        except ValueError:
            diffs.append(region)

    return diffs

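Now, cv2.RETR_EXTERNAL is supposed to yield only "external contours": if there are diffs inside other diffs (say a box's border color changed, and some text inside the box changed as well), it should yield just one box, namely the outer ("external") one.

Except that's not what it does, so I wrote a dumb function to naively weed out the inner boxes: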

def filter_diffs(diffs):
    def not_contained(e, diffs):
        for t in diffs:
            if e[0] > t[0] and e[2] < t[2] and e[1] > t[1] and e[3] < t[3]:
                return False
        return True

    return [e for e in diffs if not_contained(e, diffs)]

Which then gets used in a function that highlights the differences using colored rectangles.

# OpenCV colors are in BGR order
RED = (0, 0, 255)
GREEN = (0, 255, 0)
BLUE = (255, 0, 0)

def highlight_diffs(a, b, diffs):
    diffed = b.copy()

    for area in filter_diffs(diffs):
        x1, y1, x2, y2 = area
        cv2.rectangle(diffed, (x1, y1), (x2, y2), RED, 2)

    cv2.imshow("Diffed", diffed)
    cv2.waitKey(0)  # without this the window is never rendered

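That gets us the first part. Taking a screenshot of Stackoverflow, and then another screenshot after moving the left ad down and recoloring the --yellow-100 CSS variable:

This finds five diffs, but two of those aren't really "diffs" in the sense of new or removed content; they are the result of "we moved something down".

So, let's add in template matching: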

def highlight_diffs(a, b, diffs):
    diffed = b.copy()

    for area in filter_diffs(diffs):
        x1, y1, x2, y2 = area

        # is this a relocation, or an addition/deletion?
        org = find_in_original(a, b, area)
        if org is not None:
            cv2.rectangle(a, (org[0], org[1]), (org[2], org[3]), BLUE, 2)
            cv2.rectangle(diffed, (x1, y1), (x2, y2), BLUE, 2)
        else:
            cv2.rectangle(diffed, (x1+2, y1+2), (x2-2, y2-2), GREEN, 1)
            cv2.rectangle(diffed, (x1, y1), (x2, y2), RED, 2)

    cv2.imshow("Original", a)
    cv2.imshow("Diffed", diffed)
    cv2.waitKey(0)

Using the following template matching code, with a pretty strict threshold for "is the match we found actually any good":

def find_in_original(a, b, area):
    crop = b[area[1]:area[3], area[0]:area[2]]
    # matchTemplate takes (image, template): search for the crop inside a
    result = cv2.matchTemplate(a, crop, cv2.TM_CCOEFF_NORMED)

    (minVal, maxVal, minLoc, maxLoc) = cv2.minMaxLoc(result)
    (startX, startY) = maxLoc
    endX = startX + (area[2] - area[0])
    endY = startY + (area[3] - area[1])
    ocrop = a[startY:endY, startX:endX]

    # this basically needs to be a near-perfect match
    # for us to consider it a "moved" region rather than
    # a genuine difference between A and B.
    if structural_similarity(gray(ocrop), gray(crop)) >= 0.99:
        return [startX, startY, endX, endY]

    # no sufficiently close match: treat the region as a genuine difference
    return None

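We can now compare the original and the modified image, and see that the ad got moved in the modified image rather than being "new content", and we can see where it was in the original:

And that's it: we have a visual diff that actually tells us something useful about the changes, rather than telling us which pixels happen to be a different color.

We could bring the template matching threshold down a little, say to 0.95, in which case the whitespace box would also end up matching against the original image, but because it's just whitespace it would get matched to something mostly meaningless (in this specific case, it matches the whitespace in the lower right of the original).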
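
One way to keep a lower threshold from pairing whitespace with arbitrary whitespace is to skip template matching for crops that carry almost no visual information. This is a sketch under assumptions, not part of the answer's code: the helper name `is_featureless` and the cut-off `min_std` are hypothetical, and the cut-off would need tuning for real screenshots.

```python
import numpy as np

def is_featureless(crop, min_std=2.0):
    """Return True when the pixel values in crop barely vary (e.g. plain whitespace)."""
    return float(np.std(crop)) < min_std


# A blank white patch versus a patch containing a dark square.
blank = np.full((20, 20, 3), 255, dtype=np.uint8)
textured = blank.copy()
textured[5:15, 5:15] = 0

print(is_featureless(blank), is_featureless(textured))
```

In `find_in_original`, such a check could run on `crop` before the call to `cv2.matchTemplate`, returning `None` early so that blank regions are reported as plain differences instead of relocations.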

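Of course, a quality-of-life improvement would be to cycle through colors, so that each moved part can be related to its original by their shared color, but that's something anyone can add on top of this code themselves.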
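
A minimal sketch of that color cycling, assuming the moved regions have already been collected as `(original_box, new_box)` pairs. The palette, the function name `assign_colors`, and the pair layout are illustrative choices, not part of the code above.

```python
from itertools import cycle

# BGR colors, as OpenCV expects; the palette itself is an arbitrary choice.
PALETTE = [(255, 0, 0), (0, 165, 255), (255, 0, 255), (255, 255, 0)]

def assign_colors(moved_pairs, palette=PALETTE):
    """Give each (original_box, new_box) pair one color shared by both boxes."""
    colors = cycle(palette)
    return [(org, new, next(colors)) for org, new in moved_pairs]


# Five hypothetical moved regions: the fifth wraps around to the first color.
pairs = [((0, 0, 10, 10), (0, 50, 10, 60))] * 5
colored = assign_colors(pairs)
print([color for _, _, color in colored])
```

Each resulting triple can then be drawn with two `cv2.rectangle` calls, one on the original and one on the modified image, using the same color.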

Answer 3

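Suggestion:

The problem can be greatly eased if you are able to segment the blue sentences, which can be achieved by morphological dilation followed by binarization. If the dilation is strong enough that all characters come into contact (while distinct lines of text remain separated), connected component labelling can extract whole lines.

Now that you have bounding boxes, the number of positions to try is greatly reduced.

Also have a look at diff algorithms, which can be relevant for sequential text. https://en.wikipedia.org/wiki/Diff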
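
To illustrate how a text-style diff could be applied once lines or blocks have been segmented, here is a small sketch using Python's standard `difflib`. It assumes each extracted block has already been reduced to a comparable token (for example a hash of its pixels); the token strings and the function name `classify_blocks` are hypothetical.

```python
from difflib import SequenceMatcher

def classify_blocks(tokens_a, tokens_b):
    """Split the differences between two token sequences into moved, removed and added."""
    matcher = SequenceMatcher(None, tokens_a, tokens_b, autojunk=False)
    deleted, inserted = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            deleted.extend(tokens_a[i1:i2])
        if tag in ("insert", "replace"):
            inserted.extend(tokens_b[j1:j2])
    # A token that disappears in one place and reappears in another was relocated.
    moved = [t for t in deleted if t in inserted]
    removed = [t for t in deleted if t not in inserted]
    added = [t for t in inserted if t not in deleted]
    return moved, removed, added


# Hypothetical block tokens in reading order for the two screenshots.
page_a = ["header", "ad", "q1", "q2", "footer"]
page_b = ["header", "q1", "q2", "ad", "blank", "footer"]
moved, removed, added = classify_blocks(page_a, page_b)
print(moved, removed, added)
```

This only works when identical blocks produce identical tokens, so it fits the question's scenario (content unchanged, only repositioned) better than cases with rendering differences.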

Answer 4

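Unfortunately, I was unable to produce the exact desired result, but with a fairly blunt algorithm I got somewhat close. The general algorithm is: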

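1. Add identical random noise to each image.

   See the first and third panes of Figure 1. Adding the same noise to both images ensures that featureless regions (e.g. white backgrounds) can be compared via phase correlation (see below).

2. Populate a zero matrix with a boxed subsection of image 1.

   An example of this matrix is shown in the middle pane of Figure 1. This matrix must have the same dimensions as images 1 and 2.

3. Perform a phase correlation between the matrix from step 2 and noisy image 2.

   There are several knobs you can turn here that may improve the final result. See Performing a phase correlation with fft in R

4. Extract the x and y "shift" values associated with the highest correlation value.

   These values indicate how the matrix from step 2 should be shifted in the x and y directions to best match noisy image 2.

5. Move the boxed subsection to a new position, so that over the iterations every region of noisy image 1 gets covered. Then repeat steps 3 and 4.

   This is done by looping over the rows and columns of image 1. You can loop over every index or skip several indices.

6. Create matrices of the x and y shifts and plot them to see how the regions in image 1 compare to image 2.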

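Note that the result isn't exactly what you were looking for, but it is close. Basically, red regions indicate that the corresponding region in image 1 does not need to move. Yellow regions (in this case) need to move down slightly, orange regions need to move further, and white regions need to move up.

Again, adding the same noise to images 1 and 2 is an important step. The algorithm depends on isolating small boxed regions (in the example code I used 50x50 pixel boxes). As you loop over the rows and columns of image 1 and isolate the corresponding boxed regions, several of them will contain no features. This causes problems for the phase correlation, because a featureless boxed region will have multiple high correlation values across all areas with a similarly featureless background. Effectively, adding noise adds features to both images to reduce ambiguous phase correlations.

The reason this algorithm produces results that differ from the desired ones is that the boxed regions aren't selected in a clever way; they are simply picked as you loop over the rows and columns of image 1. At the chosen box size, some boxed regions contain features that have different translations relative to image 2. This algorithm would probably work better after the region clustering algorithm proposed by yapws87.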
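
For readers more at home in Python, the core of steps 1, 3 and 4 can be sketched with NumPy. This is an illustrative sketch of the same idea, not a port of the code that produced the results; the function names `add_shared_noise` and `phase_correlation_shift` are assumptions, and the shift is reported as a signed (row, column) offset.

```python
import numpy as np

def add_shared_noise(img1, img2, scale=0.1, seed=0):
    """Add the SAME uniform noise field to two equally sized images (step 1)."""
    noise = np.random.default_rng(seed).random(img1.shape) * scale
    return img1 + noise, img2 + noise

def phase_correlation_shift(boxed, target):
    """Estimate the (row, col) shift that moves the content of boxed onto target.

    boxed is a zero matrix holding one box of image 1 (step 2) and target is
    the full second image; both must have the same shape (steps 3 and 4).
    """
    cross = np.fft.fft2(target) * np.conj(np.fft.fft2(boxed))
    cross /= np.maximum(np.abs(cross), 1e-12)   # keep only the phase
    corr = np.real(np.fft.ifft2(cross))
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peaks past the midpoint correspond to negative (wrapped) shifts.
    return tuple(int(p) if p <= n // 2 else int(p) - n
                 for p, n in zip(peak, corr.shape))


# Synthetic check: image 2 is image 1 moved 5 rows down and 3 columns left.
rng = np.random.default_rng(1)
image1 = rng.standard_normal((64, 64))
image2 = np.roll(image1, (5, -3), axis=(0, 1))
box = np.zeros_like(image1)
box[10:42, 12:44] = image1[10:42, 12:44]
print(phase_correlation_shift(box, image2))
```

Looping this over a grid of boxes and storing the two shift components per box gives the shift matrices of step 6.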

Here is the R code that produced these results:

## readJPEG() comes from the 'jpeg' package
library(jpeg)

## read in the images
img1 <- readJPEG('./img1.jpg')
img2 <- readJPEG('./img2.jpg')

## grayscale the images
img1 <- (img1[,,1]+img1[,,2]+img1[,,3])/3
img2 <- (img2[,,1]+img2[,,2]+img2[,,3])/3


## rotate the images for more intuitive R plotting
img1 <- t(apply(img1,2,rev))
img2 <- t(apply(img2,2,rev))

## create some uniform noise 
noise <- matrix(runif(n=nrow(img1)*ncol(img1)),nrow=nrow(img1),ncol=ncol(img1))*0.1

## add the SAME noise to both images
img1 <- noise+img1
img2 <- noise+img2

## normalize both images by their mean (this may not be necessary)
img1 <- img1/mean(img1)
img2 <- img2/mean(img2)

## Take the conjugate of the fft of the second image
IMG2c <- Conj(fft(img2))

## define how to loop through the first image
row.step=50
col.step=50

## create a zero image (made with all 0s)
zero.img <- matrix(0,ncol=ncol(img1),nrow=nrow(img1))

## initialize some vectors to hold the x and y
## shifts that correspond to the highest phase correlation value
shift.x.vec=NULL
shift.y.vec=NULL

## keep track of how many iterations you go through
i.iters=1

## loop over the columns
i=1
while((i+row.step-1)<nrow(img1)) {

    ## keep track of how many iterations you go through
    j.iters=1

    ## loop over the rows
    j=1
    while((j+col.step-1)<ncol(img1)) {

        ## define a current 'box' as the zero image
        cbox1 <- zero.img

        ## then populate a small box with values from image 1
        cbox1[i:(i+row.step-1),j:(j+col.step-1)] <- img1[i:(i+row.step-1),j:(j+col.step-1)]

        ## PERFORM THE PHASE CORRELATION

        ## go into the frequency domain
        CBOX1 <- fft(cbox1)

        ## find a normalized value
        norm <- abs(CBOX1 * IMG2c)

        ## perform the phase correlation and go back to the space domain
        corr <- Re(fft((CBOX1 * IMG2c)/norm,inv=TRUE)/length(CBOX1))

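        ## this rearranges the quadrants of the matrix, see
        ## matlab's function fftshift
        ## (fftshift is not part of base R; it has to be defined
        ## by the user or loaded from a package)
        corr <- fftshift(corr)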

        ## find the x and y index values associated with the
        ## highest correlation value.
        shift <- which(corr==max(corr),arr.ind=TRUE)
        shift.x <- shift[1]
        shift.y <- shift[2]

        ## populate the x and y shift vectors
        shift.x.vec <- c(shift.x.vec,shift.x)
        shift.y.vec <- c(shift.y.vec,shift.y)

        ## THIS IS ADDITIONAL PLOTTING AND CAN BE IGNORED
        if(i.iters==6 & j.iters==6) {
            dev.new()
            ##jpeg('./example.jpeg',width=900,height=700)
            split.screen(c(1,3))
            screen(1)
            image(1:nrow(img1),1:ncol(img1),img1,col=gray.colors(200),axes=FALSE,ylab="",xlab="",useRaster=TRUE,main='Noisy Image 1')
            rect(j,i,(j+col.step-1),(i+row.step-1))

            screen(2)
            image(cbox1,col=gray.colors(200),axes=FALSE,useRaster=TRUE,main='Current Box')

            screen(3)
            image(img2,col=gray.colors(200),axes=FALSE,useRaster=TRUE,main='Noisy Image 2')

            ##dev.off()
        }




        j.iters=j.iters+1
        j=j+col.step
    }

    i.iters=i.iters+1
    i=i+row.step

}

## make a matrix of shifts values
## in this example, only the y shifts are interesting though
shift.x.mat <- matrix(shift.x.vec,ncol=j.iters-1,nrow=i.iters-1,byrow=TRUE)
shift.y.mat <- matrix(shift.y.vec,ncol=j.iters-1,nrow=i.iters-1,byrow=TRUE)


##jpeg('./final.jpeg',width=800,height=800)
image(shift.y.mat,axes=FALSE,useRaster=TRUE)
##dev.off()

How to find all the shared regions between two images
