我尝试用开放式简历来做到这一点。该代码适用于高质量的图像,但需要帮助从该图像中提取
from PIL import Image
import pytesseract
import argparse
import os
import cv2
import numpy as np
img = cv2.imread(r"/home/ubuntu/xyz/xyz.jpg")
img = cv2.resize(img, None, fx=1.5, fy=1.5, interpolation=cv2.INTER_CUBIC)
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
kernel = np.ones((1, 1), np.uint8)
img = cv2.dilate(img, kernel, iterations=1)
img = cv2.erode(img, kernel, iterations=1)
img = cv2.GaussianBlur(img, (5, 5), 0)
img = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)\[1\]
# Save the filtered image
cv2.imwrite(r"/home/ubuntu/xyz/rr.jpg", img)
# Read text with tesseract for python
result = pytesseract.image_to_string(img, lang="eng")
result
答案 0 :(得分:0)
在这种情况下为什么需要img = cv2.GaussianBlur(img, (5, 5), 0)
erosion
大窗口(5,5)
我认为您可以在外部设置白色边框,而不用调整图像大小,
并且您可以使用<gfe:replicated-region id="someRegion"
shortcut="REPLICATE_PERSISTENT" concurrency-level=100
persistent="true" disk-synchronous="true" statistics="true">
<gfe:eviction action="OVERFLOW_TO_DISK" type="ENTRY_COUNT"
threshold=1000></gfe:eviction>
</gfe:replicated-region>
技术来消除图像中的噪波