Question

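I want to binarize this image:

so that I can use it with tesseract-ocr. So far, I have managed to get this:

But I need a clean image that contains only the text, without the black background regions, like this: imgur.com/KXQNErM

My current code: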

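import cv2

# `path` is the input image path (not shown in the question)
img = cv2.imread(path, 0)  # 0 = load as grayscale
blur = cv2.GaussianBlur(img, (3, 3), 0)  # note: computed but never used below
# Adaptive Gaussian threshold, block size 405, C = 1, applied to the unblurred image
filtered = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 405, 1)
bitnot = cv2.bitwise_not(filtered)  # invert black and white
cv2.imshow('image', bitnot)
cv2.imwrite("h2kcw2/out1.png", bitnot)
cv2.waitKey(0)
cv2.destroyAllWindows()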

Answer 1

A plain global threshold can produce good results here:

Result

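import cv2

img = cv2.imread(path, 0)  # 0 = load as grayscale
# Inverse binary threshold: pixels <= 70 become 255, all others become 0
ret, thresh = cv2.threshold(img, 70, 255, cv2.THRESH_BINARY_INV)
cv2.imshow('image', thresh)
cv2.imwrite("h2kcw2/out1.png", thresh)
cv2.waitKey(0)
cv2.destroyAllWindows()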
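
To make explicit what the fixed threshold above does, here is a minimal NumPy sketch of the `cv2.THRESH_BINARY_INV` rule: pixels brighter than the threshold become 0, and all others become `maxval`. The helper name `binary_inv_threshold` and the small sample array are hypothetical, made up only for illustration. The value 70 comes from the answer and would probably need tuning for other images.

```python
import numpy as np

def binary_inv_threshold(gray, thresh=70, maxval=255):
    """Hypothetical helper mirroring the cv2.THRESH_BINARY_INV rule for uint8 images."""
    gray = np.asarray(gray, dtype=np.uint8)
    # Pixels brighter than `thresh` map to 0; all others map to `maxval`.
    return np.where(gray > thresh, 0, maxval).astype(np.uint8)

# Made-up 2x3 grayscale patch mixing dark and bright values.
sample = np.array([[10, 70, 71],
                   [200, 40, 255]], dtype=np.uint8)
result = binary_inv_threshold(sample)
print(result)
```

Because the cutoff is a single global value, this only works when the regions to keep and the regions to discard are separated by brightness across the whole image.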
