Question

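I want to read the text in an image and label each word in the image with its coordinates. To do this, I extracted the coordinates, so I have an ROI (a rectangle) for each word. I now run a for loop that calls Pytesseract on every ROI (each ROI contains one word, and each image has about 150 of them). The loop takes about 40-50 seconds. Is there a way to make it faster?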

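I find the coordinates by first detecting contours with OpenCV and then computing the bounding boxes around those contours. The coordinates of each word are essentially the top-left and bottom-right corners of the bounding box that encloses it.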
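For reference, the box extraction described above might look roughly like the following. This is a minimal sketch, not my exact code: the helper names `box_corners` and `word_boxes`, the `min_area` noise filter, and the choice of `cv2.RETR_EXTERNAL` are assumptions made for illustration.

```python
def box_corners(x, y, w, h):
    """Convert an (x, y, w, h) box to (left, top, right, bottom) corners."""
    return (x, y, x + w, y + h)


def word_boxes(binary_image, min_area=20):
    """Return corner boxes for the contours found in a thresholded image.

    min_area is a hypothetical filter that drops tiny specks of noise.
    """
    import cv2  # imported here so box_corners works without OpenCV installed

    # [-2] selects the contour list on both OpenCV 3 (three return values)
    # and OpenCV 4 (two return values)
    contours = cv2.findContours(
        binary_image, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE
    )[-2]

    boxes = []
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        if w * h >= min_area:
            boxes.append(box_corners(x, y, w, h))
    return boxes
```

Calling `word_boxes(thresh)` on the thresholded page would give one `(left, top, right, bottom)` tuple per contour, which is the coordinate format stored in `results` in the loop further down.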

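Accuracy is very good at an image size of 1300 x 1800. Reducing the size degrades the results. I preprocess the image by converting it to grayscale and applying a threshold to improve the detection rate.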

import cv2
import pytesseract

# `image` is the preprocessed page; `contours` comes from cv2.findContours
# the Tesseract options are the same for every ROI, so build them once
config = "-l eng --oem 1 --psm 1"
results = []

for c in contours:
    # bounding box of the contour: (x, y) is the top-left corner,
    # (w, h) are the width and height
    (x, y, w, h) = cv2.boundingRect(c)

    # extract the ROI from the image, then draw its bounding box
    roi = image[y:y + h, x:x + w].copy()
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 1)

    # convert the image area defined by roi to text
    text = pytesseract.image_to_string(roi, config=config)

    results.append((x, y, x + w, y + h, text))

Is there a way to make Pytesseract faster on an image with about 150 ROIs? I assign an ROI to each individual word in the image.
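One direction that might be worth testing is a single `pytesseract.image_to_data` call on the whole page instead of one `image_to_string` call per ROI, since each call starts a separate Tesseract process. The sketch below is untested on my images and rests on assumptions: `words_from_data` and `ocr_page` are hypothetical helper names, `--psm 3` is a guess at a suitable page segmentation mode, and Tesseract's own word boxes may not match the contour-based boxes.

```python
def words_from_data(data, min_conf=0.0):
    """Collect (left, top, right, bottom, text) for each non-empty word.

    `data` is the dictionary returned by pytesseract.image_to_data with
    output_type=Output.DICT. min_conf is a hypothetical confidence cutoff.
    """
    results = []
    for i, text in enumerate(data["text"]):
        text = text.strip()
        # non-word rows (pages, blocks, lines) have empty text and conf -1;
        # conf may be a string or a number depending on the version
        if not text or float(data["conf"][i]) < min_conf:
            continue
        x, y = data["left"][i], data["top"][i]
        w, h = data["width"][i], data["height"][i]
        results.append((x, y, x + w, y + h, text))
    return results


def ocr_page(image, config="-l eng --oem 1 --psm 3"):
    """Run Tesseract once on the full image and return word boxes."""
    import pytesseract  # imported here so words_from_data works standalone
    from pytesseract import Output

    data = pytesseract.image_to_data(
        image, config=config, output_type=Output.DICT
    )
    return words_from_data(data)
```

`ocr_page(image)` would return tuples in the same `(x1, y1, x2, y2, text)` layout as `results` above. Whether accuracy holds up compared with the per-ROI loop would need to be checked on real pages.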

0 answers: