Question

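There is one image (imA) of size 10x10 pixels and more than 60,000 other images (imN), also 10x10.

All images are black and white.

The task is to find the minimum number of points (pixels) needed to distinguish the first image (imA) from all the other images (imN). Sorry for my poor English; I have added an image and comments.

The first thing I did was convert all the images to matrices with numpy: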

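import os
import numpy
from PIL import Image

# inputImages, generatorFolder and templateimage are set earlier (not shown)
a = []
for file in inputImages:
    eachImage = os.path.join(generatorFolder, file)
    a.append(numpy.asarray(Image.open(eachImage)))
a = numpy.array(a)  # shape: (number of images, y, x, color)

b = numpy.asarray(Image.open(templateimage))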

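b is indexed as b[y, x, color], where color is a list such as [255, 255, 255].

a is indexed as a[1-60000, y, x, color].

Next I use nested comparisons, a non-recursive search to a depth of 3 points, which looks like this: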

for y1 in range(b.shape[0]):
    for x1 in range(b.shape[1]):
        for y2 in range(b.shape[0]):
            for x2 in range(b.shape[1]):
                for y3 in range(b.shape[0]):
                    for x3 in range(b.shape[1]):
                        # skip the case where all three points coincide
                        if y1 == y2 == y3 and x1 == x2 == x3:
                            continue

                        # check stays 0 only if no image matches b at all three points
                        check = 0
                        for a_el in range(a.shape[0]):
                            if numpy.array_equal(b[y1, x1], a[a_el, y1, x1]) and \
                               numpy.array_equal(b[y2, x2], a[a_el, y2, x2]) and \
                               numpy.array_equal(b[y3, x3], a[a_el, y3, x3]):
                                check = 1
                                break

                        if not check:
                            return 'its unic dots'

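The problem with this code is that it is very slow. For example, if at least five points are needed to tell the first image apart from the others, we get:

100! / 95! * 60,000 comparisons = 542,070,144,000,000

Yes, I actually use a slightly different algorithm, which reduces this to 40! / 35! * 60,000 = 4,737,657,600,000, but that is still not small.

Is there a more elegant way to solve my problem than brute force?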
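The two counts above can be checked with a short script. This is only an arithmetic sketch; the variable names are made up for illustration, and `math.perm` and `math.comb` require Python 3.8 or later. The last line is an added observation rather than something from the question: the nested loops enumerate ordered tuples of points, while the order of the points does not matter for the comparison, so counting combinations already divides the work by 5! = 120.

```python
import math

n_images = 60_000

# Ordered choices of 5 distinct points out of 100, each checked against every image.
full_search = math.perm(100, 5) * n_images      # 100! / 95! * 60000
# The same count when only 40 candidate points are considered.
reduced_search = math.perm(40, 5) * n_images    # 40! / 35! * 60000

print(f"{full_search:,}")     # 542,070,144,000,000
print(f"{reduced_search:,}")  # 4,737,657,600,000

# If the order of the 5 points is ignored, combinations are enough.
unordered_search = math.comb(100, 5) * n_images
print(f"{unordered_search:,}")  # 4,517,251,200,000
```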

Update: added an image.


Row 0: three other images (imN), 4x4.

Row 1: column 0 is the template image (imA); in images 1-3 the differences are marked in red (imA XOR imN).

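Row 2: column 0 is the template image with the two points chosen for comparison marked in blue.

    Image 1: green is a difference, red is a compared point. There is a difference, so move on to the NEXT image.

    Image 2: red is a compared point. There is NO difference, so break (these two points are not enough to say that imA differs from imN(2)).

Row 3: like row 2, but with other points.

Row 4: we have chosen two points that are enough to say that imA differs from imN (1-3).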

Answer 1

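If I understand your question, you need to count the points in the first picture that differ from all the other pictures, regardless of how the other pictures differ from each other?

If that is the case, and unless I am missing something, can't you simply do the following: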

boolean[10][10] DIFFS // all values set to TRUE
int[10][10] ORIGINAL  // store first pictures color values

foreach IMAGE in [IMAGES - FIRST IMAGE] {
    int[10][10] CURRENT <- IMAGE // store the current image's color values
    for (i : 0 -> 9) {
        for (j : 0 -> 9) {
            if (DIFFS[i][j]) {
                DIFFS[i][j] = ORIGINAL[i][j] != CURRENT[i][j]
            }
        }
    }
}

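You are then left with a two-dimensional matrix DIFFS, where each position indicates whether the corresponding pixel of the original image differs from that pixel in all of the other images.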
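Since the question uses numpy, the pseudocode above could be sketched as a single vectorised comparison. This is a minimal sketch, not the answerer's code: the function name and the random stand-in data are hypothetical, and the images are assumed to be 0/1 arrays of identical shape.

```python
import numpy as np

def pixels_differing_everywhere(original, images):
    """Return a boolean mask that is True where `original` differs from every image."""
    # images != original broadcasts the comparison over the first axis;
    # np.all(..., axis=0) keeps a pixel only if it differs in all images.
    return np.all(images != original, axis=0)

rng = np.random.default_rng(0)
original = rng.integers(0, 2, size=(10, 10))     # hypothetical reference image
images = rng.integers(0, 2, size=(1000, 10, 10)) # hypothetical other images

diffs = pixels_differing_everywhere(original, images)
print(diffs.shape)  # (10, 10)
print(int(diffs.sum()), "pixels differ from all other images")
```

Any single True position in the mask is enough on its own to tell the original apart from every other image. If the mask is all False, a combination of several pixels is needed, which this check does not find.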

Answer 2

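My approach would be:

Read the images into a 60,000 x 100 array of 1s and 0s.
Sum them pixel by pixel to count the number of images in which each pixel is "set" to 1.
Pick the pixel that is set in the reference image and has the lowest sum. (If the sum is 0, that pixel alone is enough to distinguish the reference image from all the others.)
Now look only at the images that have that bit set, recompute the sums, and again pick the lowest.
Repeat until all the set bits of the reference image have been selected (which means that either it cannot be distinguished or all of the bits are needed) or until the sum equals 1, which means that only one image has that bit set.

In code, for 1000 4x4 images: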

import numpy

def least_freq_set_pixel(array, must_have):
    # If it is specified that a certain pixels must be set, remove rows that don't have those pixels set
    if must_have != []:
        for i in must_have:
            array = numpy.delete(array, numpy.where(array[:,i] == 0), axis = 0)

    # Calculate the sum for each pixel
    set_in_n = numpy.sum(array, axis = 0)
    my_index = numpy.argmin(set_in_n)

    # Return the pixel number which is set in the fewest images
    return my_index, set_in_n[my_index]


# Create some test data 4x4 images
numpy.random.seed(11)
a = numpy.array([0,1,0,0,0,1,0,0,0,1,0,0,0,0,0,0])
b = numpy.random.randint(0,2,(1000,16))


must_have = []
stop = 0
while stop == 0:
    i, j = least_freq_set_pixel(b, must_have)
    print(i, j)
    # If the pixel is set in more than one image and not all pixels have been selected yet... find the next pixel
    if j > 1 and len(must_have) <= 16:
        must_have.append(i)
    else:
        stop = 1
        print(must_have)

This tells us that we need 7 of the 16 pixels to separate the reference image from the rest: pixels 0, 1, 2, 4, 5, 10 and 15.
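A related variation on the steps above is sketched below. It is not the code from this answer. Instead of looking only at set bits, it works on the difference mask (reference XOR image), so pixels that are 0 in the reference can also be chosen, and at each step it picks the pixel that separates the most images still indistinguishable from the reference. The function name is hypothetical, and being greedy it is not guaranteed to find the true minimum number of pixels.

```python
import numpy as np

def greedy_distinguishing_pixels(reference, others):
    """Greedily pick pixel indices that separate `reference` from every row of `others`.

    reference: 1-D array of 0/1 values; others: 2-D array, one flattened image per row.
    Returns a list of pixel indices, or None if some image is identical to the reference.
    """
    # diff[n, p] is True where image n differs from the reference at pixel p.
    diff = others != reference
    remaining = np.ones(len(others), dtype=bool)  # images not yet told apart
    chosen = []
    while remaining.any():
        # For each pixel, count how many remaining images it distinguishes.
        counts = diff[remaining].sum(axis=0)
        best = int(np.argmax(counts))
        if counts[best] == 0:
            return None  # a remaining image matches the reference everywhere
        chosen.append(best)
        # Images that differ at the chosen pixel are now handled.
        remaining &= ~diff[:, best]
    return chosen

reference = np.array([0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0])
rng = np.random.default_rng(11)
others = rng.integers(0, 2, size=(1000, 16))  # random stand-in data
print(greedy_distinguishing_pixels(reference, others))
```

Each pass costs one sum over the remaining rows, so even with 60,000 images of 100 pixels the whole search stays far below the brute-force counts in the question.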

Answer 3

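10x10 = 100, so comparing two images takes 100 comparisons. You have 60,000 images. I think the algorithm must be O(100 * 60,000) = O(6,000,000). I don't know Python, but the pseudo-algorithm should look like this: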

int minDistinguishPoints = 100; 
int currentPointsDiff;
Image imA;

foreach (Image myImg in ImageItems)
{
    currentPointsDiff = 0;
    for (int i=0; i<10; i++)
       for (int j=0; j<10; j++)
       {
           if (!imA.GetPixel(i,j).Equals(myImg.GetPixel(i,j)))
           {
               currentPointsDiff++;
           }                
       }
    if (minDistinguishPoints > currentPointsDiff)
    {
        minDistinguishPoints = currentPointsDiff;
    }
}

Maybe I have misunderstood the question. If so, please explain it in more detail.
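For readers working in Python, the loop above could be written with numpy roughly as follows. This is a sketch under the assumption that the images are 0/1 arrays; the function name and random data are hypothetical. It computes the same quantity as the pseudocode: the smallest number of differing pixels between imA and any other image.

```python
import numpy as np

def min_differing_pixels(im_a, images):
    """Return the smallest number of pixels in which im_a differs from any image."""
    # Compare im_a with every image at once, then count mismatches per image.
    diffs_per_image = (images != im_a).reshape(len(images), -1).sum(axis=1)
    return int(diffs_per_image.min())

rng = np.random.default_rng(1)
im_a = rng.integers(0, 2, size=(10, 10))            # hypothetical reference image
images = rng.integers(0, 2, size=(60_000, 10, 10))  # hypothetical other images
print(min_differing_pixels(im_a, images))
```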

Answer 4

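If I understand correctly, your problem can be redefined completely. I think what you want to achieve is this: quickly determine whether a given image is equal to one of the 60,000 predefined images or to none of them. Each image is 10x10 black/white.

So each image can be interpreted as an array of 10x10 = 100 bits, and you have 60,000 predefined values to compare it against.

Why not simply convert the 60,000 images to 100-bit integers and sort them? You can then look up any 100-bit integer very efficiently and get a hit or a miss.

Edit: if I understand the comments correctly, the images can be larger. As long as the number of known images stays manageable (60k, perhaps even 600k), you can generate hashes for these images, then sort and compare those. There is some up-front computation cost, but you pay it only once.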
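A minimal sketch of this idea, assuming the images are 0/1 numpy arrays. The helper names `image_to_int` and `contains` are made up for illustration, and the known images are random stand-ins. Each image is packed into one Python integer, the integers are sorted once, and a lookup is a binary search.

```python
import bisect
import numpy as np

def image_to_int(image):
    """Pack a 0/1 image into a single Python integer, one bit per pixel."""
    bits = np.asarray(image).astype(bool).ravel()
    # packbits pads the last byte with zeros, which is consistent across images
    return int.from_bytes(np.packbits(bits).tobytes(), "big")

def contains(sorted_keys, image):
    """Binary-search the sorted keys for the given image."""
    key = image_to_int(image)
    pos = bisect.bisect_left(sorted_keys, key)
    return pos < len(sorted_keys) and sorted_keys[pos] == key

rng = np.random.default_rng(2)
known = rng.integers(0, 2, size=(60_000, 10, 10))  # hypothetical known images
sorted_keys = sorted(image_to_int(img) for img in known)  # one-off preparation

print(contains(sorted_keys, known[123]))  # True
```

A Python `set` of the same integers would give the same hit-or-miss answer without sorting; the sorted list is used here only because the answer describes sorting.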
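For larger images, the hashing variant might look like the sketch below. The choice of SHA-256 and of a Python `set` in place of a sorted list are assumptions made for this illustration, not something the answer specifies, and `image_digest` is a hypothetical helper. All images are assumed to share the same shape, since only the raw pixel bytes are hashed.

```python
import hashlib
import numpy as np

def image_digest(image):
    """Hash the raw bytes of a 0/1 image (any size) with SHA-256."""
    data = np.ascontiguousarray(image, dtype=np.uint8).tobytes()
    return hashlib.sha256(data).digest()

rng = np.random.default_rng(3)
known = rng.integers(0, 2, size=(1000, 64, 64))       # hypothetical larger images
known_digests = {image_digest(img) for img in known}  # one-off up-front cost

query = known[42]
print(image_digest(query) in known_digests)  # True
```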

Algorithm to quickly compare images\matrices
