Question

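I have been trying to perform OCR on alcohol bottle labels. I want to detect the characters and then recognize them. I tried existing OCR tools and APIs (Tesseract, OCR.space, etc.), but the results were poor. I have seen many CRNN projects for the same task, but they use the MJSynth dataset or something similar. I do not want to build a dataset covering every alcohol bottle, because there are far too many of them; I want to work character by character instead.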

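I trained a model on the Chars74K dataset and used contours to get the bounding boxes of the characters. However, the model does not give accurate results. The contours are also not continuous.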

Here is the code:

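import cv2
import numpy as np
# Assumption: the original snippet used `image` without showing its import;
# the Keras preprocessing helper is the most likely source.
from tensorflow.keras.preprocessing import image

# Class labels, in the order used when the classifier was trained
labels = ['0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z']

# `model` is the character classifier trained on Chars74K (loaded elsewhere, not shown)

im = cv2.imread('img1.jpg', 0)  # 0 = read as grayscale
ret, thresh1 = cv2.threshold(im, 127, 255, cv2.THRESH_BINARY)
contours, hierarchy = cv2.findContours(thresh1, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
for cnt in contours:
    x, y, w, h = cv2.boundingRect(cnt)
    # Ignore noise: keep only boxes whose size is plausible for a single character
    if 18 < w < 90 and h > 30:
        cropped = thresh1[y:y+h, x:x+w]
        # Invert the crop, then resize it to the 32x32 input expected by the model
        ret, thresh2 = cv2.threshold(cropped, 127, 255, cv2.THRESH_BINARY_INV)
        thresh2 = cv2.resize(thresh2, (32, 32))
        img = image.img_to_array(thresh2)
        img = np.expand_dims(img, axis=0)  # add the batch dimension
        result = model.predict(img)
        index = np.argmax(result)
        print(labels[index], end="")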
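
One possible reading of "not continuous" is that the bounding boxes come back in no particular reading order, so the printed characters are scrambled. Under that assumption, a minimal sketch of ordering the (x, y, w, h) boxes top to bottom and then left to right before classification might look like this. The helper name `sort_boxes_reading_order` and the `line_tol` parameter are hypothetical and not part of OpenCV.

```python
def sort_boxes_reading_order(boxes, line_tol=0.5):
    """Sort (x, y, w, h) boxes top-to-bottom, then left-to-right.

    Hypothetical helper: boxes whose vertical centres differ by at most
    line_tol * (mean box height) are treated as one text line.
    """
    if not boxes:
        return []
    mean_h = sum(h for _, _, _, h in boxes) / len(boxes)
    # Sort by vertical centre so text lines can be grouped greedily.
    by_row = sorted(boxes, key=lambda b: b[1] + b[3] / 2)
    lines = [[by_row[0]]]
    for box in by_row[1:]:
        prev = lines[-1][-1]
        prev_cy = prev[1] + prev[3] / 2
        cy = box[1] + box[3] / 2
        if abs(cy - prev_cy) <= line_tol * mean_h:
            lines[-1].append(box)   # same text line as the previous box
        else:
            lines.append([box])     # start a new text line
    # Within each line, order by the left edge of the box.
    return [b for line in lines for b in sorted(line, key=lambda b: b[0])]


if __name__ == "__main__":
    # Two lines of two boxes each, given out of order.
    demo = [(100, 10, 20, 40), (10, 12, 20, 40), (50, 100, 20, 40), (10, 98, 20, 40)]
    print(sort_boxes_reading_order(demo))
```

In the loop above, the boxes from `cv2.boundingRect` could be collected into a list, filtered by size, passed through this helper, and only then cropped and classified. This does not address characters whose strokes are split into several contours; that case would need a different fix.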

Any help with the code above, or any other suggestions and alternative approaches, would be greatly appreciated. Thanks!

Training a model to detect and recognize characters (non-handwritten) in an image
