Question

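Given an image like the one below, with a white background, some text is black on a red background and some text is red. The position of the text (with or without a background) is not fixed.

I want to extract an image of the text only.

One approach I thought of is to replace the red background with white, but then the red text inevitably disappears as well.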
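
A minimal sketch of why an exact color match cannot tell the two apart, using a hypothetical 1x3 pixel strip instead of the real image (the names `strip`, `is_red` and `cleaned` are made up for this illustration):

```python
import numpy as np

# Hypothetical 1x3 RGB strip: red background, red text, black text
strip = np.array([[[255, 0, 0], [255, 0, 0], [0, 0, 0]]], dtype=np.uint8)

red = (255, 0, 0)
white = (255, 255, 255)

# True where all three channels equal pure red
is_red = (strip == red).all(axis=-1)

cleaned = strip.copy()
cleaned[is_red] = white

print(is_red.tolist())  # [[True, True, False]]
```

The background pixel and the red text pixel have identical values, so the mask selects both and both turn white; only the black pixel survives.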

This is what I have tried:

from PIL import Image
import numpy as np

orig_color = (255, 0, 0)             # pure red
replacement_color = (255, 255, 255)  # white

img = Image.open("C:\\TEM\\AB.png").convert('RGB')
data = np.array(img)
# replace every pixel whose three channels all match orig_color
data[(data == orig_color).all(axis=-1)] = replacement_color
img2 = Image.fromarray(data)
img2.show()

The result is as follows:

What is the best way to keep only the text of the image? (Ideally as shown below.)

Thanks.

Answer 1

Here is my approach, which uses only the red and green channels of the image (with OpenCV; see the comments in the code for an explanation):

import cv2
import imageio
import numpy as np

# extract red and green channel from the image
r, g = cv2.split(imageio.imread('https://i.stack.imgur.com/bMSzZ.png'))[:2]

imageio.imsave('r-channel.png', r)
imageio.imsave('g-channel.png', g)

# white image as canvas for drawing contours
canvas = np.ones(r.shape, np.uint8) * 255

# find contours in the inverted green channel 
# change [0] to [1] when using OpenCV 3, in which contours are returned secondly
contours = cv2.findContours(255 - g, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)[0]

# drop contours that are both large and 4-point (the rectangular boxes);
# keep everything else, i.e. the letter shapes
contours = [
    cnt for cnt in contours
    if not (cv2.contourArea(cnt) > 500 and len(cnt) == 4)
]

# fill kept contours with black on the canvas
cv2.drawContours(canvas, contours, -1, 0, -1)

imageio.imsave('filtered-contours.png', canvas)

# combine kept contours with red channel using '&' to bring back the "AAA"
# use '|' with the green channel to remove contour edges around the "BBB"
result = (canvas & r) | g

imageio.imsave('result.png', result)

Resulting images: r-channel.png, g-channel.png, filtered-contours.png, result.png
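
To make the bitwise combination in the last step concrete, here is a small sketch on five hand-picked pixels rather than the real image. The pixel values and the `canvas` row are assumptions chosen for illustration (in the code above, `canvas` comes from the contour filtering); only NumPy is needed.

```python
import numpy as np

# Hypothetical 1x5 RGB strip (values chosen for illustration)
pixels = np.array([[[255, 255, 255],    # white page background
                    [255,   0,   0],    # red box behind the black text
                    [  0,   0,   0],    # black text inside the red box
                    [255,   0,   0],    # red text on the white page
                    [255, 230, 230]]],  # anti-aliased edge of the red text
                  dtype=np.uint8)

r = pixels[..., 0]
g = pixels[..., 1]

# Assumed outcome of the contour step: the red text and its edge were
# filled with black (0), everything else stayed white (255)
canvas = np.array([[255, 255, 255, 0, 0]], dtype=np.uint8)

result = (canvas & r) | g
print(result.tolist())  # [[255, 255, 0, 0, 230]]
```

The red box turns white, both kinds of text turn black, and the edge pixel keeps its light value from the green channel instead of being painted solid black.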

Update

Here is a more general solution, based on the other example image you provided in chat:

import cv2
import numpy as np

img = cv2.imread('example.png')

result = np.ones(img.shape[:2], np.uint8) * 255
for channel in cv2.split(img):
    canvas = np.ones(img.shape[:2], np.uint8) * 255
    contours = cv2.findContours(255 - channel, cv2.RETR_LIST,
                                cv2.CHAIN_APPROX_SIMPLE)[0]
    # size threshold may vary per image
    contours = [cnt for cnt in contours if cv2.contourArea(cnt) <= 100]
    cv2.drawContours(canvas, contours, -1, 0, -1)
    result = result & (canvas | channel)

cv2.imwrite('result.png', result)

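Here I no longer filter on contour length, because that causes problems when other characters touch the rectangle. All channels of the image are used so that the approach works with different colors.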

Remove color from image to keep only text
