Question

I am using the following Python code to extract text from an image:

def read_image_data(request):
    import cv2
    import pytesseract

    pytesseract.pytesseract.tesseract_cmd = "C:/Program Files/Tesseract-OCR/tesseract.exe"
    img = cv2.imread("image_path")
    height, width = img.shape[0:2]
    startCol = 345  # x1 Left
    startRow = 107  # y1 Top
    endCol = 389  # x2 Right
    endRow = 135  # y2 Bottom

    croppedImage = img[startRow:endRow, startCol:endCol]
    text = pytesseract.image_to_string(croppedImage)

    gray = cv2.cvtColor(croppedImage, cv2.COLOR_BGR2GRAY)
    ret, threshold = cv2.threshold(gray, 55, 255, cv2.THRESH_BINARY)
    print(pytesseract.image_to_string(threshold))
    print(text)

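However, the output is incorrect.

The input image contains ±0.1%, but the output I get is 201% instead of ±0.1%.

The input image contains ±50 ppm/K, but the output I get is +50 ppmik instead of ±50 ppm/K.

The input image contains 10 to 100k, but the output I get is 1010332ka. instead of 10 to 100k.

What code changes are needed to retrieve the correct characters from the image?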
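
For reference, here is a minimal sketch of preprocessing that might be worth trying. It is not a confirmed fix. The crop above is only 44 x 28 pixels, which is small for Tesseract, so the sketch pads the crop box, upscales the region, replaces the fixed threshold of 55 with Otsu's method, and passes `--psm 7` so the region is treated as a single line of text. The helper names `expand_box` and `ocr_region` and the default `scale` and `pad` values are assumptions chosen for illustration, and recognition of the ± sign may still depend on the installed language data.

```python
def expand_box(x1, y1, x2, y2, width, height, pad=4):
    """Grow the crop box by `pad` pixels per side, clamped to the image bounds."""
    return (max(x1 - pad, 0), max(y1 - pad, 0),
            min(x2 + pad, width), min(y2 + pad, height))


def ocr_region(image_path, box, scale=3, pad=4):
    """Crop `box` (x1, y1, x2, y2) from the image, clean it up, and run OCR."""
    # Imports are kept local, as in the code above.
    import cv2
    import pytesseract

    img = cv2.imread(image_path)
    if img is None:  # cv2.imread returns None when the file cannot be read
        raise FileNotFoundError(image_path)

    height, width = img.shape[:2]
    x1, y1, x2, y2 = expand_box(*box, width, height, pad)
    crop = img[y1:y2, x1:x2]

    gray = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY)
    # Upscale the small crop so the glyphs are larger before thresholding.
    gray = cv2.resize(gray, None, fx=scale, fy=scale,
                      interpolation=cv2.INTER_CUBIC)
    # Otsu's method picks the threshold from the histogram instead of a fixed 55.
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # --psm 7 asks Tesseract to treat the image as a single text line.
    return pytesseract.image_to_string(binary, config="--psm 7").strip()
```

With the coordinates from the code above, the call would look like `ocr_region("image_path", (345, 107, 389, 135))`, after setting `pytesseract.pytesseract.tesseract_cmd` as before.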

Python pytesseract extracts data from an image incorrectly

0 answers: