Question

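When I try to find Chinese words in a picture using Python, I get the error below. (By the way, I already have the "chi_sim.traineddata" training file in the tessdata directory, and finding English sentences in a picture works, so this error really confuses me.)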

C:\Users\Lenovo\AppData\Local\Programs\Python\Python37-32\python.exe E:/PKU1.3/python_math/set_for_recognition.py
Traceback (most recent call last):
  File "E:/PKU1.3/python_math/set_for_recognition.py", line 5, in <module>
    text=pytesseract.image_to_string(Image.open('climb_high.jpeg'),lang='chi_sim')
  File "C:\Users\Lenovo\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pytesseract\pytesseract.py", line 295, in image_to_string
    return run_and_get_output(*args)
  File "C:\Users\Lenovo\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pytesseract\pytesseract.py", line 203, in run_and_get_output
    run_tesseract(**kwargs)
  File "C:\Users\Lenovo\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pytesseract\pytesseract.py", line 179, in run_tesseract
    raise TesseractError(status_code, get_errors(error_string))
pytesseract.pytesseract.TesseractError: (3221225477, '')

Answer 1

Actually, error code 3221225477 is 0xC0000005 (ACCESS_VIOLATION), which means that Tesseract itself crashed (per the linked source), so switching to a different version of Tesseract may help you.
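The conversion from the decimal status in the traceback to the hex code can be checked in a couple of lines. This is only a small sketch: the helper name `describe_status` and the one-entry lookup table are made up for illustration, and the table contains only the code mentioned in this answer.

```python
# Hypothetical lookup table: only the code discussed in this answer is listed.
KNOWN_CODES = {0xC0000005: "ACCESS_VIOLATION"}


def describe_status(status_code: int) -> str:
    """Format a process exit status as 8-digit hex, with a name if we know one."""
    hex_code = f"0x{status_code:08X}"
    name = KNOWN_CODES.get(status_code)
    return f"{hex_code} ({name})" if name else hex_code


print(describe_status(3221225477))  # 0xC0000005 (ACCESS_VIOLATION)
```

The first element of the tuple in `TesseractError: (3221225477, '')` is the exit status of the Tesseract process, so the empty message is consistent with a crash rather than an ordinary error report.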

The problem occurs with 4.00 (beta) and 3.02; 3.05 works fine (I am on Windows 7).
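To see which version is actually installed before swapping it, something like the following may help. It is a sketch under assumptions: it assumes the `tesseract` executable is on PATH and that the first "major.minor" number pair in the `--version` banner is the Tesseract version; the function names are hypothetical.

```python
import re
import shutil
import subprocess


def parse_tesseract_version(banner: str):
    """Extract (major, minor) from a banner such as 'tesseract 3.05.01'."""
    match = re.search(r"(\d+)\.(\d+)", banner)
    if match is None:
        raise ValueError(f"no version number found in: {banner!r}")
    return int(match.group(1)), int(match.group(2))


def installed_tesseract_version(executable: str = "tesseract"):
    """Return (major, minor) of the Tesseract on PATH, or None if it is not found."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    # Some releases write the banner to stderr instead of stdout, so accept either.
    banner = result.stdout or result.stderr
    return parse_tesseract_version(banner)


if __name__ == "__main__":
    print(installed_tesseract_version())
```

If pytesseract points at a specific binary through `pytesseract.pytesseract.tesseract_cmd`, pass that path as `executable` so the same binary is checked.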

Hope this helps.

Answer 2

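I got this error because my UZN file described a zone outside the image area. I patched pytesseract.py (adding print(' '.join(cmd_args)) in run_tesseract()) to see the command being run, and that surfaced an assertion error.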
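A check like the one below could catch such a zone before Tesseract is run. It is a rough sketch that assumes each UZN line has the form "left top width height label" in pixels; the function name `zones_outside_image` is hypothetical, and the image size is passed in so the check itself needs no imaging library.

```python
def zones_outside_image(uzn_text: str, image_width: int, image_height: int):
    """Return the 1-based line numbers of zones that do not fit inside the image."""
    bad_lines = []
    for line_number, line in enumerate(uzn_text.splitlines(), start=1):
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or incomplete lines
        left, top, width, height = (int(value) for value in fields[:4])
        fits = (
            left >= 0
            and top >= 0
            and left + width <= image_width
            and top + height <= image_height
        )
        if not fits:
            bad_lines.append(line_number)
    return bad_lines


# Example: the second zone extends past the right edge of a 1000x500 image.
example = "10 10 100 50 Text\n900 10 200 50 Text\n"
print(zones_outside_image(example, 1000, 500))  # [2]
```

With Pillow, the width and height can be taken from `Image.open(path).size`.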

Answer 3

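I think this problem is caused by the traineddata.

I used to develop an OCR project with Tesseract on Windows 7.

Now I have switched to Windows 10, and this problem appeared.

However, I found that the problem is related to the traineddata:

if I use traineddata that was trained on Windows 7, it runs normally without any error message.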
