Question

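I have a huge dataset of images, each showing a logo at an arbitrary position on a white sheet of paper. How can I use Python to retrieve the coordinates of the object (top-left and bottom-right corners) from an image?

For example, consider this image http://ak9.picdn.net/shutterstock/videos/5360279/thumb/3.jpg (ignore the shadow). I want to highlight the egg in the image.

Edit: The images are high resolution and there are a great many of them, so an iterative solution takes a lot of time. One thing I forgot to mention is that the images are stored in 1-bit mode, so I think we could get a better solution using numpy.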

Answer 1

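If the rest of the picture is a single color, you can compare every pixel against that color and look for the first one that differs, which marks where the object begins. Note that I assume the top-left pixel is the background color; if that is not always the case, use a different approach (for example, take the most common pixel color):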

import numpy as np
from PIL import Image

def differs(pixel, back_np, difference):
    # True if any channel differs from the background by more than `difference`.
    # np.max (unlike the built-in max) also accepts single-channel pixels,
    # which PIL returns as plain ints for modes such as "L" and "1".
    return np.max(np.abs(np.array(pixel) - back_np)) > difference

# Each function returns None if no pixel differs from the background.

def get_y_top(pix, width, height, background, difference):
    # Scan rows from the top; return the first row that contains the object.
    back_np = np.array(background)
    for y in range(0, height):
        for x in range(0, width):
            if differs(pix[x, y], back_np, difference):
                return y

def get_y_bot(pix, width, height, background, difference):
    # Scan rows from the bottom; return the last row that contains the object.
    back_np = np.array(background)
    for y in range(height-1, -1, -1):
        for x in range(0, width):
            if differs(pix[x, y], back_np, difference):
                return y

def get_x_left(pix, width, height, background, difference):
    # Scan columns from the left; return the first column that contains the object.
    back_np = np.array(background)
    for x in range(0, width):
        for y in range(0, height):
            if differs(pix[x, y], back_np, difference):
                return x

def get_x_right(pix, width, height, background, difference):
    # Scan columns from the right; return the last column that contains the object.
    back_np = np.array(background)
    for x in range(width-1, -1, -1):
        for y in range(0, height):
            if differs(pix[x, y], back_np, difference):
                return x

img = Image.open('test.jpg')
width, height = img.size
pix = img.load()
background = pix[0,0]  # assume the top-left pixel is background

difference = 20  # or whatever works for you; use trial and error to establish this number
y_top = get_y_top(pix, width, height, background, difference)
y_bot = get_y_bot(pix, width, height, background, difference)
x_left = get_x_left(pix, width, height, background, difference)
x_right = get_x_right(pix, width, height, background, difference)
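For images whose top-left pixel is not background, the "most common pixel color" idea mentioned above could be sketched roughly as follows. The helper name `most_common_color` is hypothetical (it is not part of PIL or numpy), and the sketch assumes the image has already been converted to a numpy array:

```python
import numpy as np

def most_common_color(arr):
    """Return the most frequent pixel value of an H x W x C (or H x W) array as a tuple."""
    arr = np.asarray(arr)
    # Flatten to a list of pixels, one row per pixel.
    if arr.ndim == 3:
        pixels = arr.reshape(-1, arr.shape[2])
    else:
        pixels = arr.reshape(-1, 1)
    # Count how often each distinct pixel value occurs.
    colors, counts = np.unique(pixels, axis=0, return_counts=True)
    return tuple(int(c) for c in colors[counts.argmax()])

# Synthetic example: a white image whose top-left corner is NOT background.
demo = np.full((5, 5, 3), 255, dtype=np.uint8)
demo[0:2, 0:2] = (200, 0, 0)
background = most_common_color(demo)
print(background)
```

With a real file, something like `background = most_common_color(np.array(img))` could stand in for `pix[0,0]`. Counting every pixel of a high-resolution image is not free, so sampling a subset of pixels may be a reasonable shortcut.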

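Using this information, you can crop the image and save it:

# crop's right and lower bounds are exclusive, so add 1 to keep the last row and column
img = img.crop((x_left, y_top, x_right + 1, y_bot + 1))
img.save('test3.jpg')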

The result is the image cropped to the object's bounding box.
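Since the question notes that per-pixel iteration is too slow, here is a minimal vectorized sketch of the same background-difference idea using numpy. The function name `bounding_box` and its parameters are assumptions made for illustration, not part of the answer above:

```python
import numpy as np

def bounding_box(arr, background, difference=20):
    """Return (left, top, right, bottom) of pixels that differ from background.

    right and bottom are exclusive, so the tuple can be passed to
    PIL's Image.crop or used as slice bounds. Returns None if nothing differs.
    """
    arr = np.asarray(arr).astype(int)            # avoid uint8 wrap-around
    diff = np.abs(arr - np.asarray(background, dtype=int))
    if diff.ndim == 3:                            # color image: take the largest channel difference
        diff = diff.max(axis=2)
    mask = diff > difference
    rows = np.flatnonzero(mask.any(axis=1))       # rows containing the object
    cols = np.flatnonzero(mask.any(axis=0))       # columns containing the object
    if rows.size == 0:
        return None
    return int(cols[0]), int(rows[0]), int(cols[-1]) + 1, int(rows[-1]) + 1

# Synthetic example: white 6x8 image with a dark block at rows 2-3, columns 3-5.
demo = np.full((6, 8, 3), 255, dtype=np.uint8)
demo[2:4, 3:6] = 0
box = bounding_box(demo, background=demo[0, 0])
print(box)
```

With a PIL image this could be called as `bounding_box(np.array(img), background)` and the result handed to `img.crop`. For 1-bit images `np.array(img)` yields booleans, so the differences are only 0 or 1 and `difference=0` would be the appropriate tolerance.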

Answer 2

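For this image (an egg on a white background):

You can crop it with the following steps:

Read the image and convert it to grayscale

Threshold and invert

Find the extreme coordinates and crop

For the egg image, with shape (480, 852, 3), this takes 0.016 s.

Code: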

#!/usr/bin/python3
# 2018/04/10 19:39:14
# 2018/04/10 20:25:36 
import cv2
import numpy as np
import matplotlib.pyplot as plt

import time
ts = time.time()

## 1. Read and convert to gray
fname = "egg.jpg"
img = cv2.imread(fname)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

##  2. Threshold and Invert
th, dst = cv2.threshold(gray, 240, 255, cv2.THRESH_BINARY_INV)

##  3. Find the extreme coordinates and crop 
ys, xs = np.where(dst>0)
# Slice ends are exclusive, so add 1 to keep the last row and column
target = img[ys.min():ys.max()+1, xs.min():xs.max()+1]

te = time.time()
print("Time passed: {:.3f} s".format(te-ts))
# OpenCV stores images as BGR; matplotlib expects RGB
plt.imshow(cv2.cvtColor(target, cv2.COLOR_BGR2RGB))
plt.show()

## Time passed: 0.016 s
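For images already stored in 1-bit mode, as the question's edit mentions, the same threshold-invert-extremes idea can probably be expressed with Pillow alone, using `Image.getbbox()`, which returns the bounding box of the non-zero region. This is a sketch under assumptions: the helper name `logo_bbox` is hypothetical, and the threshold of 240 is simply carried over from the code above.

```python
from PIL import Image

def logo_bbox(img, threshold=240):
    """Return (left, upper, right, lower) of pixels darker than threshold, or None."""
    gray = img.convert("L")                      # accepts "1", "L" and "RGB" inputs
    # Threshold and invert in one step: dark pixels become 255, light background 0.
    mask = gray.point(lambda p: 255 if p < threshold else 0)
    return mask.getbbox()                        # bounding box of the non-zero pixels

# Synthetic example: a white 1-bit page with a black rectangle on it.
page = Image.new("L", (40, 30), 255)
page.paste(0, (10, 5, 25, 20))
page = page.convert("1")
bbox = logo_bbox(page)
print(bbox)
```

The returned tuple uses exclusive right and lower bounds, so it can be passed straight to `img.crop(bbox)`.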
