强大的本地自适应软二值化！那就是我所说的。

Question

在拍摄一张纸时（例如使用手机相机），我得到以下结果（左图）（jpg下载here）。所需的结果（使用图像编辑软件手动处理）在右侧：

我想用openCV处理原始图像，以自动获得更好的亮度/对比度（以使背景更白）。

假设：图像具有A4纵向格式（在本主题中我们无需透视变形），并且纸页为白色，可能带有黑色或彩色的文本/图像。

到目前为止，我已经尝试过：

各种自适应阈值方法，例如高斯，OTSU（请参见OpenCV doc Image Thresholding）。通常可以在OTSU上很好地工作：
```
ret, gray = cv2.threshold(img, 0, 255, cv2.THRESH_OTSU + cv2.THRESH_BINARY)
```
，但仅适用于灰度图像，不适用于彩色图像。此外，输出是二进制（白色或黑色），我不希望：我更喜欢保留彩色非二进制图像作为输出
Histogram equalization
- 应用于Y（在RGB => YUV变换之后）
- 或应用于V（在RGB => HSV转换后），
根据此answer（Histogram equalization not working on color image - OpenCV）或此one（OpenCV Python equalizeHist colored image）的建议：
```
img3 = cv2.imread(f)
img_transf = cv2.cvtColor(img3, cv2.COLOR_BGR2YUV)
img_transf[:,:,0] = cv2.equalizeHist(img_transf[:,:,0])
img4 = cv2.cvtColor(img_transf, cv2.COLOR_YUV2BGR)
cv2.imwrite('test.jpg', img4)
```
或使用HSV：
```
img_transf = cv2.cvtColor(img3, cv2.COLOR_BGR2HSV)
img_transf[:,:,2] = cv2.equalizeHist(img_transf[:,:,2])
img4 = cv2.cvtColor(img_transf, cv2.COLOR_HSV2BGR)
```
不幸的是，结果非常糟糕，因为它会在本地（？）产生可怕的微对比度：

我也尝试使用YCbCr，它很相似。

我还尝试过tileGridSize至1的{{1}}到各种1000：

img3 = cv2.imread(f)
img_transf = cv2.cvtColor(img3, cv2.COLOR_BGR2HSV)
clahe = cv2.createCLAHE(tileGridSize=(100,100))
img_transf[:,:,2] = clahe.apply(img_transf[:,:,2])
img4 = cv2.cvtColor(img_transf, cv2.COLOR_HSV2BGR)
cv2.imwrite('test.jpg', img4)

但是结果也同样糟糕。

按照问题CLAHE (Contrast Limited Adaptive Histogram Equalization)的建议使用LAB颜色空间执行此CLAHE方法：

import cv2, numpy as np
bgr = cv2.imread('_example.jpg')
lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)
lab_planes = cv2.split(lab)
clahe = cv2.createCLAHE(clipLimit=2.0,tileGridSize=(100,100))
lab_planes[0] = clahe.apply(lab_planes[0])
lab = cv2.merge(lab_planes)
bgr = cv2.cvtColor(lab, cv2.COLOR_LAB2BGR)
cv2.imwrite('_example111.jpg', bgr)

也给出了不好的结果。输出图像：

在每个通道上分别进行自适应阈值均衡或直方图均衡（R，G，B）不是一种选择，因为它会破坏色彩平衡，如{{3}所述}。

来自scikit-image在How to apply CLAHE on RGB color images的教程中的

“对比度拉伸” 方法：

图像被重新缩放以包括落在第2个百分点和第98个百分点内的所有强度

稍好一些，但仍远未达到预期的效果（请参见此问题上方的图片）。

TL; DR：如何使用OpenCV / Python对一张纸的彩色照片进行自动亮度/对比度优化？可以使用哪种阈值/直方图均衡/其他技术？？

Answer 1

强大的本地自适应软二值化！那就是我所说的。

出于不同的目的，我之前做过类似的事情，因此这可能无法完全满足您的需求，但希望对您有所帮助（我晚上也写了这段代码供个人使用，所以很难看）。从某种意义上讲，与您的案例相比，此代码旨在解决一个更一般的案例，在这种情况下，我们在后台可能会遇到很多结构性噪声（请参见下面的演示）。

此代码的作用？给定一张纸的照片，它将变白，以便可以完美打印。请参见下面的示例图片。

Teaser：：这就是采用这种算法后（前后）的页面外观。请注意，即使颜色标记注释也已消失，所以我不知道这是否适合您的用例，但是代码可能有用：

要获得完全干净的结果，您可能需要考虑一下过滤参数，但正如您所见，即使使用默认参数，效果也很好。

第0步：剪切图像使其紧贴页面

让我们假设您以某种方式执行了此步骤（在您提供的示例中似乎是如此）。如果您需要手动注释和重新绘制工具，请pm pm！ ^^此步骤的结果如下（我在这里使用的示例可能比您提供的示例难，尽管可能与您的情况不完全相符）：

由此，我们可以立即看到以下问题：

光照条件不均匀。这意味着所有简单的二值化方法均无效。我尝试了OpenCV中提供的许多解决方案，以及它们的组合，但没有一个起作用！

背景噪音很大。就我而言，我需要除去纸的网格以及从纸张另一面可透过薄片看到的墨水。 / li>

步骤1：伽玛校正

此步骤的目的是平衡整个图像的对比度（因为根据光照条件，您的图像可能会略微过度曝光/曝光不足）。

这乍看起来似乎是不必要的步骤，但它的重要性不可低估：从某种意义上说，它可以将图像归一化为相似的曝光分布，以便以后可以选择有意义的超参数（例如下一节的DELTA参数，噪声过滤参数，形态学参数等）

# Somehow I found the value of `gamma=1.2` to be the best in my case def adjust_gamma(image, gamma=1.2): # build a lookup table mapping the pixel values [0, 255] to # their adjusted gamma values invGamma = 1.0 / gamma table = np.array([((i / 255.0) ** invGamma) * 255 for i in np.arange(0, 256)]).astype("uint8") # apply gamma correction using the lookup table return cv2.LUT(image, table)

以下是伽玛调整的结果：

您可以看到它现在更多……“平衡”。没有此步骤，您将在后续步骤中手动选择的所有参数将变得不那么健壮！

第2步：自适应二值化以检测文本斑点

在此步骤中，我们将自适应地将文本斑点二进制化。稍后我将添加更多评论，但基本想法是：

我们将图像划分为大小为BLOCK_SIZE的块。诀窍是要选择足够大的大小，以便您仍能获得大块的文本和背景（即大于所拥有的任何符号），但又要足够小，以免遭受任何减轻条件的影响（例如，“大，但仍然本地”）。

在每个块内，我们进行局部自适应二值化：我们查看中间值并假设它是背景（因为我们选择的BLOCK_SIZE足够大，可以使其中的大部分成为背景）。然后，我们进一步定义DELTA-基本上只是一个阈值，“我们仍会将其视为中值而离中值有多远？”。

因此，功能process_image完成了任务。此外，您可以修改preprocess和postprocess函数以适应您的需要（但是，如您从上面的示例中看到的那样，该算法非常健壮，即它可以工作开箱即用，无需修改太多参数。

此部分的代码假定前景比背景要暗（即，纸上的墨水）。但是您可以通过调整preprocess函数来轻松地更改它：只需返回255 - image而不是image。

# These are probably the only important parameters in the
# whole pipeline (steps 0 through 3).
BLOCK_SIZE = 40
DELTA = 25

# Do the necessary noise cleaning and other stuffs.
# I just do a simple blurring here but you can optionally
# add more stuffs.
def preprocess(image):
    image = cv2.medianBlur(image, 3)
    return 255 - image

# Again, this step is fully optional and you can even keep
# the body empty. I just did some opening. The algorithm is
# pretty robust, so this stuff won't affect much.
def postprocess(image):
    kernel = np.ones((3,3), np.uint8)
    image = cv2.morphologyEx(image, cv2.MORPH_OPEN, kernel)
    return image

# Just a helper function that generates box coordinates
def get_block_index(image_shape, yx, block_size): 
    y = np.arange(max(0, yx[0]-block_size), min(image_shape[0], yx[0]+block_size))
    x = np.arange(max(0, yx[1]-block_size), min(image_shape[1], yx[1]+block_size))
    return np.meshgrid(y, x)

# Here is where the trick begins. We perform binarization from the 
# median value locally (the img_in is actually a slice of the image). 
# Here, following assumptions are held:
#   1.  The majority of pixels in the slice is background
#   2.  The median value of the intensity histogram probably
#       belongs to the background. We allow a soft margin DELTA
#       to account for any irregularities.
#   3.  We need to keep everything other than the background.
#
# We also do simple morphological operations here. It was just
# something that I empirically found to be "useful", but I assume
# this is pretty robust across different datasets.
def adaptive_median_threshold(img_in):
    med = np.median(img_in)
    img_out = np.zeros_like(img_in)
    img_out[img_in - med < DELTA] = 255
    kernel = np.ones((3,3),np.uint8)
    img_out = 255 - cv2.dilate(255 - img_out,kernel,iterations = 2)
    return img_out

# This function just divides the image into local regions (blocks),
# and perform the `adaptive_mean_threshold(...)` function to each
# of the regions.
def block_image_process(image, block_size):
    out_image = np.zeros_like(image)
    for row in range(0, image.shape[0], block_size):
        for col in range(0, image.shape[1], block_size):
            idx = (row, col)
            block_idx = get_block_index(image.shape, idx, block_size)
            out_image[block_idx] = adaptive_median_threshold(image[block_idx])
    return out_image

# This function invokes the whole pipeline of Step 2.
def process_image(img):
    image_in = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    image_in = preprocess(image_in)
    image_out = block_image_process(image_in, BLOCK_SIZE)
    image_out = postprocess(image_out)
    return image_out

结果是类似墨水迹线的漂亮斑点：

第3步：二进制化的“软”部分

具有覆盖符号的斑点以及更多一点，我们终于可以进行美白过程了。

如果我们仔细观察带有文字的纸片（尤其是带有手写文字的纸片）的照片，则从“背景”（白皮书）到“前景”（深色墨水）的转换并不清晰，但很渐进。本节中其他基于二值化的答案都提出了一个简单的阈值设置（即使它们是局部自适应的，但仍然是一个阈值），该阈值适用于打印的文本，但是在手写时会产生不那么精确的结果。

因此，本节的目的是我们要保留从黑色到白色的渐进透射效果，就像使用天然墨水的纸张的自然照片一样。这样做的最终目的是使其可打印。

主要思想很简单：像素值（在上述阈值之后）与局部最小值之间的差异越大，则它属于背景的可能性就越大。我们可以使用Sigmoid函数族来表达这一点，将其重新缩放到局部块的范围（这样，该函数就可以在整个图像中进行自适应缩放）。

# This is the function used for composing
def sigmoid(x, orig, rad):
    k = np.exp((x - orig) * 5 / rad)
    return k / (k + 1.)

# Here, we combine the local blocks. A bit lengthy, so please
# follow the local comments.
def combine_block(img_in, mask):
    # First, we pre-fill the masked region of img_out to white
    # (i.e. background). The mask is retrieved from previous section.
    img_out = np.zeros_like(img_in)
    img_out[mask == 255] = 255
    fimg_in = img_in.astype(np.float32)

    # Then, we store the foreground (letters written with ink)
    # in the `idx` array. If there are none (i.e. just background),
    # we move on to the next block.
    idx = np.where(mask == 0)
    if idx[0].shape[0] == 0:
        img_out[idx] = img_in[idx]
        return img_out

    # We find the intensity range of our pixels in this local part
    # and clip the image block to that range, locally.
    lo = fimg_in[idx].min()
    hi = fimg_in[idx].max()
    v = fimg_in[idx] - lo
    r = hi - lo

    # Now we use good old OTSU binarization to get a rough estimation
    # of foreground and background regions.
    img_in_idx = img_in[idx]
    ret3,th3 = cv2.threshold(img_in[idx],0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)

    # Then we normalize the stuffs and apply sigmoid to gradually
    # combine the stuffs.
    bound_value = np.min(img_in_idx[th3[:, 0] == 255])
    bound_value = (bound_value - lo) / (r + 1e-5)
    f = (v / (r + 1e-5))
    f = sigmoid(f, bound_value + 0.05, 0.2)

    # Finally, we re-normalize the result to the range [0..255]
    img_out[idx] = (255. * f).astype(np.uint8)
    return img_out

# We do the combination routine on local blocks, so that the scaling
# parameters of Sigmoid function can be adjusted to local setting
def combine_block_image_process(image, mask, block_size):
    out_image = np.zeros_like(image)
    for row in range(0, image.shape[0], block_size):
        for col in range(0, image.shape[1], block_size):
            idx = (row, col)
            block_idx = get_block_index(image.shape, idx, block_size)
            out_image[block_idx] = combine_block(
                image[block_idx], mask[block_idx])
    return out_image

# Postprocessing (should be robust even without it, but I recommend
# you to play around a bit and find what works best for your data.
# I just left it blank.
def combine_postprocess(image):
    return image

# The main function of this section. Executes the whole pipeline.
def combine_process(img, mask):
    image_in = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    image_out = combine_block_image_process(image_in, mask, 20)
    image_out = combine_postprocess(image_out)
    return image_out

一些东西被注释，因为它们是可选的。 combine_process函数采用上一步中的掩码，并执行整个合成管道。您可以尝试与他们玩耍以获取您的特定数据（图像）。结果很整洁：

可能我会在此答案的代码中添加更多注释和解释。将整个内容（连同裁剪和变形代码一起）上传到Github。

Answer 2

可以分别使用alpha（α）和beta（β）调整亮度和对比度。表达式可以写为

OpenCV已经将其实现为cv2.convertScaleAbs()，因此我们可以将此功能与用户定义的alpha和beta值一起使用。

import cv2
import numpy as np
from matplotlib import pyplot as plt

image = cv2.imread('1.jpg')

alpha = 1.95 # Contrast control (1.0-3.0)
beta = 0 # Brightness control (0-100)

manual_result = cv2.convertScaleAbs(image, alpha=alpha, beta=beta)

cv2.imshow('original', image)
cv2.imshow('manual_result', manual_result)
cv2.waitKey()

但是问题是

如何获得彩色照片的自动亮度/对比度优化？

本质上，问题是如何自动计算alpha和beta。为此，我们可以查看图像的直方图。自动亮度和对比度优化计算alpha和beta，以便输出范围为[0...255]。我们计算累积分布以确定颜色频率小于某个阈值（例如1％）的位置，并剪切直方图的右侧和左侧。这给了我们最小和最大范围。这是裁剪前（蓝色）和裁剪后（橙色）的直方图的可视化。

要计算alpha，我们在裁剪后取最小和最大灰度范围，并将其与我们期望的输出范围255

相除

α = 255 / (maximum_gray - minimum_gray)

要计算beta，我们将其插入公式g(i, j)=0和f(i, j)=minimum_gray

g(i,j) = α * f(i,j) + β

解决后会得出这样的结果

β = -minimum_gray * α

对于您的图片，我们得到了

alpha 3.75

beta -311.25

您可能必须调整剪切阈值以优化结果。这是对其他图片使用1％阈值的一些示例结果

自动的亮度和对比度代码

import cv2
import numpy as np
from matplotlib import pyplot as plt

# Automatic brightness and contrast optimization with optional histogram clipping
def automatic_brightness_and_contrast(image, clip_hist_percent=1):
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # Calculate grayscale histogram
    hist = cv2.calcHist([gray],[0],None,[256],[0,256])
    hist_size = len(hist)

    # Calculate cumulative distribution from the histogram
    accumulator = []
    accumulator.append(float(hist[0]))
    for index in range(1, hist_size):
        accumulator.append(accumulator[index -1] + float(hist[index]))

    # Locate points to clip
    maximum = accumulator[-1]
    clip_hist_percent *= (maximum/100.0)
    clip_hist_percent /= 2.0

    # Locate left cut
    minimum_gray = 0
    while accumulator[minimum_gray] < clip_hist_percent:
        minimum_gray += 1

    # Locate right cut
    maximum_gray = hist_size -1
    while accumulator[maximum_gray] >= (maximum - clip_hist_percent):
        maximum_gray -= 1

    # Calculate alpha and beta values
    alpha = 255 / (maximum_gray - minimum_gray)
    beta = -minimum_gray * alpha

    '''
    # Calculate new histogram with desired range and show histogram 
    new_hist = cv2.calcHist([gray],[0],None,[256],[minimum_gray,maximum_gray])
    plt.plot(hist)
    plt.plot(new_hist)
    plt.xlim([0,256])
    plt.show()
    '''

    auto_result = cv2.convertScaleAbs(image, alpha=alpha, beta=beta)
    return (auto_result, alpha, beta)

image = cv2.imread('1.jpg')
auto_result, alpha, beta = automatic_brightness_and_contrast(image)
print('alpha', alpha)
print('beta', beta)
cv2.imshow('auto_result', auto_result)
cv2.waitKey()

具有以下代码的结果图像：

其他图像的结果阈值为1％

Answer 3

我认为这样做的方法是1）从HCL色彩空间中提取色度（饱和度）通道。（HCL比HSL或HSV更好。）仅颜色应具有非零饱和度，因此明亮和灰色阴影将是深色的。 2）使用otsu阈值化作为掩码的结果阈值。 3）将您的输入转换为灰度，然后应用局部区域（即自适应）阈值化。 4）将遮罩放到原始图像的alpha通道中，然后将局部区域阈值结果与原始图像进行合成，以便保留原始区域的彩色区域，其他任何地方都使用局部区域阈值结果。

对不起，我不太了解OpeCV，但这是使用ImageMagick的步骤。

请注意，通道的编号从0开始。（H = 0或红色，C = 1或绿色，L = 2或蓝色）

输入：

magick image.jpg -colorspace HCL -channel 1 -separate +channel tmp1.png

magick tmp1.png -auto-threshold otsu tmp2.png

magick image.jpg -colorspace gray -negate -lat 20x20+10% -negate tmp3.png

magick tmp3.png \( image.jpg tmp2.png -alpha off -compose copy_opacity -composite \) -compose over -composite result.png

添加：

这是Python Wand代码，可产生相同的输出结果。它需要Imagemagick 7和Wand 0.5.5。

#!/bin/python3.7

from wand.image import Image
from wand.display import display
from wand.version import QUANTUM_RANGE

with Image(filename='text.jpg') as img:
    with img.clone() as copied:
        with img.clone() as hcl:
            hcl.transform_colorspace('hcl')
            with hcl.channel_images['green'] as mask:
                mask.auto_threshold(method='otsu')
                copied.composite(mask, left=0, top=0, operator='copy_alpha')
                img.transform_colorspace('gray')
                img.negate()
                img.adaptive_threshold(width=20, height=20, offset=0.1*QUANTUM_RANGE)
                img.negate()
                img.composite(copied, left=0, top=0, operator='over')
                img.save(filename='text_process.jpg')

Answer 4

此方法应适合您的应用程序。首先，找到一个阈值，该阈值可以在强度直方图中很好地分隔分布模式，然后使用该值重新调整强度。

from skimage.filters import threshold_yen
from skimage.exposure import rescale_intensity
from skimage.io import imread, imsave

img = imread('mY7ep.jpg')

yen_threshold = threshold_yen(img)
bright = rescale_intensity(img, (0, yen_threshold), (0, 255))

imsave('out.jpg', bright)

我在这里使用Yen的方法，可以在this page上了解有关此方法的更多信息。

Answer 5

首先，我们将文本和颜色标记分开。这可以在具有色彩饱和度通道的色彩空间中完成。相反，我使用了一种受this paper启发的非常简单的方法：对于（浅色）灰色区域，min（R，G，B）/ max（R，G，B）的比率将接近1，而对于彩色区域。对于深灰色区域，我们得到0到1之间的任何值，但这无关紧要：这些区域进入颜色蒙版，然后按原样添加，或者它们不包含在蒙版中，并且对二值化后的输出有所贡献文本。对于黑色，我们使用0/0转换为uint8时变为0的事实。

灰度图像文本在本地经过阈值处理以生成黑白图像。您可以从this comparison或that survey中选择自己喜欢的技术。我选择了NICK技术，该技术可以很好地应对低对比度并且相当健壮，也就是说，参数k在-0.3到-0.1之间选择对于非常宽的条件非常有效，这对于自动处理非常有用。对于提供的示例文档，所选择的技术由于其照明相对均匀而没有太大的作用，但是为了应对照明不均匀的图像，应该使用局部阈值化技术。

最后一步，将颜色区域重新添加到二值化的文本图像中。

因此，此解决方案与@ fmw42的解决方案非常相似（所有想法都归功于他），只是使用了不同的颜色检测和二值化方法。

image = cv2.imread('mY7ep.jpg')

# make mask and inverted mask for colored areas
b,g,r = cv2.split(cv2.blur(image,(5,5)))
np.seterr(divide='ignore', invalid='ignore') # 0/0 --> 0
m = (np.fmin(np.fmin(b, g), r) / np.fmax(np.fmax(b, g), r)) * 255
_,mask_inv = cv2.threshold(np.uint8(m), 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)
mask = cv2.bitwise_not(mask_inv)

# local thresholding of grayscale image
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
text = cv2.ximgproc.niBlackThreshold(gray, 255, cv2.THRESH_BINARY, 41, -0.1, binarizationMethod=cv2.ximgproc.BINARIZATION_NICK)

# create background (text) and foreground (color markings)
bg = cv2.bitwise_and(text, text, mask = mask_inv)
fg = cv2.bitwise_and(image, image, mask = mask)

out = cv2.add(cv2.cvtColor(bg, cv2.COLOR_GRAY2BGR), fg)

如果您不需要颜色标记，则可以对灰度图像进行二值化处理：

image = cv2.imread('mY7ep.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
text = cv2.ximgproc.niBlackThreshold(gray, 255, cv2.THRESH_BINARY, at_bs, -0.3, binarizationMethod=cv2.ximgproc.BINARIZATION_NICK)

强大的本地自适应软二值化！那就是我所说的。

第0步：剪切图像使其紧贴页面

步骤1：伽玛校正

第2步：自适应二值化以检测文本斑点

第3步：二进制化的“软”部分

使用OpenCV自动调整一张纸的彩色照片的对比度和亮度

5 个答案: