Question

I am trying to run text recognition on an invoice.

import pytesseract
from pytesseract import Output
import cv2

pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files/Tesseract-OCR/tesseract.exe'

img = cv2.imread('bill_copy.jpg')
d = pytesseract.image_to_data(img, output_type=Output.DICT)
n_boxes = len(d['level'])
for i in range(n_boxes):
    (x, y, w, h) = (d['left'], d['top'], d['width'], d['height'])
    img = cv2.rectangle(img, (x, y), (x + w, y + h), (0, 0, 255), 2)

cv2.imshow(img, 'img')

When I run it, I get a TypeError: an integer is required (got type tuple).

Answer 1

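The x, y, w, h values are lists with one entry for every segmented region, but the loop draws the rectangles one at a time.

So on each iteration you need to pass a single integer for each of those parameters (x, y, w, h).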

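Your code has several other errors as well (for example, the arguments to cv2.imshow are swapped and there is no cv2.waitKey call). The corrected code should look like this: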

import pytesseract
from pytesseract import Output
import cv2

pytesseract.pytesseract.tesseract_cmd = r'C:/Program Files/Tesseract-OCR/tesseract.exe'

img = cv2.imread('bill_copy.jpg')
d = pytesseract.image_to_data(img, output_type=Output.DICT)
n_boxes = len(d['level'])
# Each of these is a list with one entry per detected box
(x, y, w, h) = (d['left'], d['top'], d['width'], d['height'])

for i in range(n_boxes):
    # Index with i so that cv2.rectangle receives plain integers
    img = cv2.rectangle(img, (x[i], y[i]), (x[i] + w[i], y[i] + h[i]), (0, 0, 255), 2)

# Window name first, then the image
cv2.imshow('img', img)
# Keep the window open until a key is pressed
cv2.waitKey(0)
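
To see why the indexing matters without running Tesseract, here is a minimal sketch that uses a hand-built dict in place of the real `image_to_data` result. The sample values and the helper name `box_corners` are made up for illustration; only the key names follow the code above.

```python
# Hand-built stand-in for the dict returned by
# pytesseract.image_to_data(img, output_type=Output.DICT).
# The numbers are invented for illustration only.
sample = {
    "level": [1, 2, 5],
    "left": [0, 10, 12],
    "top": [0, 20, 22],
    "width": [200, 80, 30],
    "height": [100, 40, 15],
}


def box_corners(d, i):
    """Return the top-left and bottom-right corners of the i-th box."""
    x, y, w, h = d["left"][i], d["top"][i], d["width"][i], d["height"][i]
    return (x, y), (x + w, y + h)


# Each key holds a whole column, so d["left"] is a list, not an int.
print(type(sample["left"]).__name__)      # list

# Without indexing, "x + w" silently concatenates two lists,
# so nothing fails until the result is handed to cv2.rectangle.
print(sample["left"] + sample["width"])   # [0, 10, 12, 200, 80, 30]

# Indexing with i gives the plain integers that cv2.rectangle expects.
print(box_corners(sample, 1))             # ((10, 20), (90, 60))
```

The two tuples returned by `box_corners` can be passed directly as the second and third arguments of `cv2.rectangle`.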

Answer 2

The problem in your code is in the following statement:

(x, y, w, h) = (d['left'], d['top'], d['width'], d['height'])

You need to take the i-th value of each list:

(x, y, w, h) = (d['left'][i], d['top'][i], d['width'][i], d['height'][i])

That should solve the problem.
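
As an alternative to indexing with `i`, the four lists can be walked in parallel with `zip`, which makes it hard to pass a whole list by accident. This is only a sketch: the helper name `iter_boxes` and the sample dict are hypothetical, and the dict stands in for the real `image_to_data` result.

```python
def iter_boxes(d):
    """Yield one (x, y, w, h) tuple of ints per detected region."""
    yield from zip(d["left"], d["top"], d["width"], d["height"])


# Made-up stand-in for the dict returned by image_to_data.
sample = {
    "left": [5, 60],
    "top": [7, 7],
    "width": [50, 40],
    "height": [20, 20],
}

for x, y, w, h in iter_boxes(sample):
    # With a real image loaded, this is where the drawing call would go:
    # cv2.rectangle(img, (x, y), (x + w, y + h), (0, 0, 255), 2)
    print((x, y), (x + w, y + h))
```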

TypeError: an integer is required (got type tuple) <python> <OpenCV> <tesseract>
