Unable to extract white text with Tesseract

Time: 2018-06-14 09:04:46

Tags: python-3.x tesseract opencv3.0 python-tesseract

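I am developing an application that uses OpenCV and Tesseract to extract text from images. I am stuck on images that contain white text, or a mix of white text and text in other colors.

Text that is not white is extracted easily, but white text is not recognized. I am using Tesseract to extract the text. Below is the image-processing code I run before passing the image to Tesseract: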

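import collections
import os

import cv2
import numpy as np

# imgPath, getFileName(), runTesseract() and logger are assumed to be defined elsewhere in the application.
image = cv2.imread(imgPath)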
filename = getFileName()

img2gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
#cv2.imwrite('python_scripts/temp/img2gray_' + filename, img2gray)

# Remove shadows from the image
rgb_planes = cv2.split(img2gray)

result_planes = []
result_norm_planes = []
for plane in rgb_planes:
    dilated_img = cv2.dilate(plane, np.ones((1, 1), np.uint8))
    bg_img = cv2.medianBlur(dilated_img, 21)
    diff_img = 255 - cv2.absdiff(plane, bg_img)
    norm_img = diff_img
    norm_img = cv2.normalize(diff_img, norm_img, alpha=0, beta=255, norm_type=cv2.NORM_MINMAX, dtype=cv2.CV_8UC1)
    result_planes.append(diff_img)
    result_norm_planes.append(norm_img)

result = cv2.merge(result_planes)
result_norm = cv2.merge(result_norm_planes)
img2gray = result


origImage = image.copy()

_, new_img = cv2.threshold(img2gray, 80, 255, cv2.THRESH_BINARY)  # for black text, use cv2.THRESH_BINARY_INV

# The kernel shape controls the direction of dilation: a larger x dilates more horizontally, a larger y dilates more vertically
kernel = cv2.getStructuringElement(cv2.MORPH_CROSS, (13, 13))
dilated = cv2.dilate(new_img, kernel, iterations=9)  # more iterations mean more dilation
cv2.imwrite('python_scripts/temp/dilated_' + filename, dilated)

_, contours, _ = cv2.findContours(dilated, cv2.RETR_LIST, cv2.CHAIN_APPROX_NONE)  # get contours

(imgH, imgW) = img2gray.shape[:2]
#logger.info("imgH: " + str(imgH) + ", imgW: " + str(imgW))
ctr = 0


# Sort contours from top to bottom by the y coordinate of their bounding boxes.
# Sorting a list keeps contours that share the same y; a dict keyed by y would overwrite them.
orderedContours = sorted(contours, key=lambda c: cv2.boundingRect(c)[1])

for contour in orderedContours:
    # get rectangle bounding contour
    (x, y, w, h) = cv2.boundingRect(contour)

    # Don't plot small false positives that aren't text
    if w < 35 and h < 35:
        continue

    length = x + w
    breadth = y + h
    area = length * breadth
    cv2.rectangle(origImage, (x, y), (x + w, y + h), (255, 255, 0), 2)
    # don't consider contour which is touching the border
    if x != 0 and y != 0 and x != imgW and y != imgH:
        #logger.info("x: " + str(x) + ", y: " + str(y) + ", length: " + str(length)
        #   + ", breadth: " + str(breadth) + ", area: " + str(area))

        croppedImg = origImage[y:(y + h), x:(x + w)]
        croppedGray = cv2.cvtColor(croppedImg, cv2.COLOR_BGR2GRAY)

        ctr = ctr + 1

        th3 = cv2.adaptiveThreshold(croppedGray,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY,75,10)
        #th3_height, th3_width = th3.shape[:2]
        #th3 = cv2.resize(th3,(2*th3_width, 2*th3_height), interpolation = cv2.INTER_CUBIC)
        th3 = cv2.GaussianBlur(th3,(5,5),0)

        tessImgPath = 'python_scripts/temp/th3_' + str(ctr) + "_" + filename
        cv2.imwrite(tessImgPath, th3)

        tessText = runTesseract(tessImgPath)
        os.remove(tessImgPath)
        logger.info("tess text: " + str(tessText))
  

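Please advise how to process the image with OpenCV so that white text and text in other colors can both be extracted in one pass.

One approach I am trying now is to measure the white portion of the image and, if it is smaller than the portion in other colors, apply bitwise_xor to the image before passing it to Tesseract.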
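The idea described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the input is already a binary 0/255 uint8 image (for example the output of cv2.threshold), text covers less area than the background, and the helper names white_fraction and normalize_polarity are hypothetical, not part of OpenCV. Since OpenCV images are NumPy arrays, the sketch uses NumPy only.

```python
import numpy as np


def white_fraction(binary):
    """Return the fraction of pixels equal to 255 in a 0/255 binary image."""
    return float(np.count_nonzero(binary == 255)) / binary.size


def normalize_polarity(binary):
    """Return the image with dark text on a light background.

    Assumption: text occupies less area than the background, so if white
    pixels are the minority, the text is white and the image is inverted.
    """
    if white_fraction(binary) < 0.5:
        # XOR with 255 flips 0 <-> 255 for every pixel of a uint8 image.
        return np.bitwise_xor(binary, np.uint8(255))
    return binary
```

On a uint8 image, cv2.bitwise_not(binary) should give the same inversion. Note that a single global decision like this cannot handle an image that mixes white and dark text; it would have to be applied per cropped text region.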

sample image sample image

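Thanks in advance.

1 answer:

Answer 0: (score: 0)

I am also working on text extraction, and Tesseract fails for me too on text in multiple colors.

White is not the problem. I tried extraction on your images. Tesseract gives accurate output for the second image because its text is in a single color.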

But the result for the first image is poor because half of the text is black and the other half is white with a gray outline. So try this to get a pure black-and-white image:

import cv2
import numpy as np

# read image
img = cv2.imread("XDJFq.jpg")

# convert img to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# do adaptive threshold on gray image
thresh = cv2.adaptiveThreshold(gray, 250, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 11)

# write results to disk
cv2.imwrite("Full_white.jpg", thresh)

# display it
cv2.imshow("thresh", thresh)
cv2.waitKey(0)

The result is here

This turns the gray outline of the white text black, so the text can then be extracted from the result. The final result is
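To make the thresholding step in this answer more concrete, here is a rough NumPy sketch of what a mean adaptive threshold with THRESH_BINARY computes: a pixel becomes max_value when it is brighter than the mean of its block_size x block_size neighborhood minus c, and 0 otherwise. The function name adaptive_mean_threshold is hypothetical, and the replicated-edge padding and floating-point mean are assumptions of this sketch, so outputs may differ slightly from cv2.adaptiveThreshold near borders or due to rounding.

```python
import numpy as np


def adaptive_mean_threshold(gray, max_value=255, block_size=11, c=11):
    """Approximate a mean adaptive threshold on a 2-D uint8 grayscale image."""
    if block_size < 3 or block_size % 2 == 0:
        raise ValueError("block_size must be an odd number >= 3")

    h, w = gray.shape
    k = block_size
    pad = k // 2

    # Replicate edge pixels so every pixel has a full k x k neighborhood.
    padded = np.pad(gray.astype(np.float64), pad, mode="edge")

    # Integral image with a leading row and column of zeros.
    integral = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    integral[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)

    # Sum over each k x k window from four corner lookups.
    window_sum = (
        integral[k:k + h, k:k + w]
        - integral[:h, k:k + w]
        - integral[k:k + h, :w]
        + integral[:h, :w]
    )
    local_mean = window_sum / (k * k)

    # Pixels brighter than (local mean - c) become max_value, the rest 0.
    return np.where(gray > local_mean - c, max_value, 0).astype(np.uint8)
```

Because the comparison is local, flat regions come out white whatever their brightness, dark strokes on a light background come out black, and the darker pixels bordering bright text also come out black. That is consistent with the answer's observation that the gray outline of the white text turns black.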