
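I am using Tesseract from Python via PyTesseract. My goal is to detect the characters on a screenshot. The text in the screenshot is perfectly aligned and binarizes cleanly, yet Tesseract's detection rate is very low.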

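I have already tried:

Gaussian filtering before and after binarization
Upscaling
Different manually chosen thresholds
No thresholding at all
Tesseract 4.0 and 5.0 alpha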
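
For reference, the upscaling and manual-threshold attempts from the list above can be sketched roughly as follows. This is only a minimal sketch under assumptions: the helper names `upscale_nearest`, `band_threshold` and `to_uint8`, the integer scale factor, and the conversion to a 0/255 uint8 image are hypothetical and do not come from the script in this question. The Gaussian filtering step is not shown.

```python
import numpy as np


def upscale_nearest(im, factor):
    """Enlarge a 2-D image by an integer factor using nearest-neighbour repetition."""
    # Hypothetical helper: repeats every row and every column `factor` times.
    return np.repeat(np.repeat(im, factor, axis=0), factor, axis=1)


def band_threshold(im, low=0.02, high=0.30):
    """Return a boolean mask that is True where im > high or im < low.

    The defaults mirror the two manual thresholds used in the question's script.
    """
    return (im > high) | (im < low)


def to_uint8(mask):
    """Convert a boolean mask to a 0/255 uint8 image (an assumed output format)."""
    return mask.astype(np.uint8) * 255


if __name__ == "__main__":
    # Tiny synthetic grayscale image in [0, 1], standing in for a screenshot.
    im = np.array([[0.0, 0.1], [0.5, 1.0]])
    big = upscale_nearest(im, 2)
    print(to_uint8(band_threshold(big)))
```

The resulting array could then be passed to `pytesseract.image_to_string` in place of the boolean mask, with different values of `low`, `high` and `factor` tried in turn.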

Here is my Python script:

from skimage import io
import pytesseract

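# Load the screenshot as a grayscale image
im = io.imread("screenshots/test.png", as_gray=True)

# Binarize: keep pixels brighter than 0.30 or darker than 0.02
# (+= on boolean NumPy arrays acts as a logical OR)
thresholded = im > .30
thresholded += im < 0.02

# Run OCR on the binarized image with the English language model
print(pytesseract.image_to_string(thresholded, lang="eng"))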

Image: https://imgur.com/a/ICp3mHu

Result:

"Dilan"
"Free-flowing track"
"Traffic restrictions"
" PE Taare people"

Tesseract fails to detect text in a very simple binarized image

0 answers: