Application error collection

I am using tesseract to extract text from images. Whenever the confidence for a word is low, I run tesseract again on just that word.
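A minimal sketch of the retry selection described above, assuming `data` is the dictionary returned by `pytesseract.image_to_data(..., output_type=Output.DICT)`. The helper name `low_confidence_boxes` and the threshold of 60 are hypothetical choices for illustration. The function only picks out the word boxes that would be cropped and re-run; it does not call tesseract itself.

```python
def low_confidence_boxes(data, threshold=60.0):
    """Return (text, (left, top, right, bottom)) for each low-confidence word.

    `data` is assumed to have the parallel lists "text", "conf", "left",
    "top", "width" and "height", as in pytesseract's DICT output.
    """
    boxes = []
    for i, text in enumerate(data["text"]):
        # conf may arrive as str, int or float depending on the version.
        conf = float(data["conf"][i])
        if conf < 0 or not text.strip():
            continue  # rows with conf -1 or empty text are not words
        if conf < threshold:
            left, top = data["left"][i], data["top"][i]
            box = (left, top, left + data["width"][i], top + data["height"][i])
            boxes.append((text, box))
    return boxes
```

Each returned box is in the (left, upper, right, lower) order that PIL's `img.crop(box)` expects, so the crop can be handed back to tesseract with a single-word page segmentation mode such as `--psm 8`.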

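However, in some cases the characters come out broken, with gaps in their strokes. I tried different combinations of settings but none of them helped. For example, with the image: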

from PIL import ImageEnhance
import pytesseract
from pytesseract import Output
img = img.resize((img.width * 2, img.height * 2))
img = ImageEnhance.Brightness(img).enhance(3.0)
data = pytesseract.image_to_data(img, output_type=Output.DICT, config='--psm 10')

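In addition, there are black dots between many of the words. Is there a library that can clean up these black dots, or do I just have to use OpenCV?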
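To make the black-dot cleanup concrete, here is a rough sketch of one common idea: erase every connected blob of ink that is smaller than some size. It is written in plain Python on a list of rows so the logic is visible, and it is not a library call. The function name `remove_small_specks` and the `min_size` default are assumptions for illustration; OpenCV's `cv2.connectedComponentsWithStats` reports component areas and could do the same job much faster on a real image.

```python
from collections import deque


def remove_small_specks(pixels, min_size=4):
    """Return a copy of a binary image with small ink blobs erased.

    pixels: list of rows, where 1 = ink (black) and 0 = background.
    Any 8-connected group of ink pixels with fewer than min_size
    pixels is cleared. The input is left unmodified.
    """
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    cleaned = [row[:] for row in pixels]
    seen = [[False] * width for _ in range(height)]

    for y in range(height):
        for x in range(width):
            if pixels[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first search collects one connected component.
            component = []
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and pixels[ny][nx] == 1
                                and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            # Blobs below the size threshold are treated as noise.
            if len(component) < min_size:
                for cy, cx in component:
                    cleaned[cy][cx] = 0
    return cleaned
```

A size threshold like this also erases genuine small marks such as periods and the dots on "i", so `min_size` would need tuning against the actual scans.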

Tesseract reads characters incorrectly

0 answers: