Question

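I am trying to use pytesseract in Python but always end up with this error:

"TesseractNotFoundError: tesseract is not installed or it's not in your path"

Both pytesseract and tesseract are installed on the system. I am new to Python, so I would be very grateful if anyone could help.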

Answer 1

I tried adding it to the Path variable as others mentioned, but still got the same error. What worked was adding this to my script:

pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"
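Rather than hard-coding one path, the script can look for the binary first. The following is a minimal sketch, not part of pytesseract: `find_tesseract` and `CANDIDATES` are hypothetical names, and the candidate paths are simply the ones mentioned in this thread.

```python
import os
import shutil

# Candidate locations are assumptions taken from paths mentioned in this thread.
CANDIDATES = [
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",
    r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe",
    "/usr/bin/tesseract",
]


def find_tesseract(candidates=CANDIDATES, name="tesseract"):
    """Return a usable path to the tesseract binary, or None if not found."""
    on_path = shutil.which(name)  # searches the PATH of the current process
    if on_path:
        return on_path
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


try:
    import pytesseract
except ImportError:
    pytesseract = None  # pytesseract is not installed in this interpreter

found = find_tesseract()
if pytesseract is not None and found:
    # Same assignment as in the answer above, but with the detected path.
    pytesseract.pytesseract.tesseract_cmd = found
```

If `find_tesseract()` returns None, the binary is most likely not installed at all, which is what several answers below address.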

Answer 2

I got this error because I had installed pytesseract with pip but forgot to install Tesseract itself with apt.

On Ubuntu:

sudo apt update
sudo apt install tesseract-ocr
sudo apt install libtesseract-dev

On Mac:

brew install tesseract
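To confirm from Python that the binary (not just the pip package) is installed, one option is to call `tesseract --version` directly. This is a sketch using only the standard library; `tesseract_version` is a hypothetical helper name.

```python
import subprocess


def tesseract_version(cmd="tesseract"):
    """Return the first line printed by `<cmd> --version`, or None if cmd is missing."""
    try:
        result = subprocess.run([cmd, "--version"], capture_output=True, text=True)
    except FileNotFoundError:
        return None  # the binary is not installed or not on PATH
    # Some versions print the banner to stderr instead of stdout.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else ""


version = tesseract_version()
print("tesseract binary not found" if version is None else version)
```

If this prints "tesseract binary not found", installing the pip package alone was not enough and one of the install commands above is still needed.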

Answer 3

Your machine is probably missing tesseract-ocr. See the installation instructions here: https://github.com/tesseract-ocr/tesseract/wiki

On a Mac, you can simply install it with Homebrew:

brew install tesseract

After that it should run fine.

Answer 4

I am running Mac OS and installed tesseract via brew, so here is my take. Since pytesseract is just a way to access tesseract from Python, you have to tell it where tesseract is located on your machine.

For Mac OS

Find where the tesseract executable is located. If you installed it with brew, run this in a terminal:

>brew list tesseract

This should list where your tesseract executable is located, more or less like

> /usr/local/Cellar/tesseract/3.05.02/bin/tesseract

Then, following their instructions:

pytesseract.pytesseract.tesseract_cmd = r'<full_path_to_your_tesseract_executable>'

pytesseract.pytesseract.tesseract_cmd = r'/usr/local/Cellar/tesseract/3.05.02/bin/tesseract'

That should do the trick!
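Picking the executable out of the `brew list tesseract` output can also be scripted. A small sketch, assuming the listing is one path per line; `pick_executable` is a hypothetical helper and the second line of the sample listing is made up for illustration.

```python
def pick_executable(listing, name="tesseract"):
    """Return the first path in a `brew list` style listing that ends in /bin/<name>."""
    for line in listing.splitlines():
        path = line.strip()
        if path.endswith("/bin/" + name):
            return path
    return None


# Sample listing; the non-binary line is a hypothetical entry.
sample = """/usr/local/Cellar/tesseract/3.05.02/bin/tesseract
/usr/local/Cellar/tesseract/3.05.02/include/tesseract/ (hypothetical entry)
"""
print(pick_executable(sample))  # /usr/local/Cellar/tesseract/3.05.02/bin/tesseract
```

In practice the listing could come from `subprocess.run(["brew", "list", "tesseract"], capture_output=True, text=True).stdout`, and the result would be assigned to `pytesseract.pytesseract.tesseract_cmd`.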

Answer 5

On Windows 10, the following worked for me:

Download tesseract from https://github.com/tesseract-ocr/tesseract/wiki and install it. Windows builds are available here: https://github.com/UB-Mannheim/tesseract/wiki
Find the script file pytesseract.py under C:\Users\User\Anaconda3\Lib\site-packages\pytesseract and open it. Change the line tesseract_cmd = 'tesseract' to: tesseract_cmd = 'D:/Program Files (x86)/Tesseract-OCR/tesseract.exe'
You may also need to add D:/Program Files (x86)/Tesseract-OCR/ to the environment variables.

Hope it works for you!

Answer 6

One simple thing that actually worked for me in Jupyter Notebook was using double backslashes, rather than single backslashes, in the pytesseract.pytesseract.tesseract_cmd path:

pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files (x86)\\Tesseract-OCR\\tesseract.exe'
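The reason this matters is that, in an ordinary Python string, a backslash starts an escape sequence. In the path above, the `\t` in front of `tesseract.exe` would turn into a tab character if the backslash were not doubled. The sketch below shows the effect on a shorter, hypothetical path (`C:\tools\tesseract.exe`).

```python
# Hypothetical path chosen because "\t" is a valid escape sequence (a tab).
plain = "C:\tools\tesseract.exe"       # each "\t" silently becomes a tab
escaped = "C:\\tools\\tesseract.exe"   # doubled backslashes stay literal
raw = r"C:\tools\tesseract.exe"        # raw string: no escape processing

print("\t" in plain)    # True
print(escaped == raw)   # True
print(repr(plain))      # shows the tabs that replaced the backslashes
```

Doubled backslashes, a raw string (`r'...'`), and forward slashes are all ways to avoid the problem.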

Answer 7

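I had the same problem. I hope you have installed Tesseract from here and have also run pip install pytesseract.

If everything went well, you should see the path C:\Program Files (x86)\Tesseract-OCR, where tesseract.exe is available.

Adding it to the Path variable did not help me; what actually worked was adding a new environment variable named tesseract with the value C:\Program Files (x86)\Tesseract-OCR\tesseract.exe.

Typing tesseract on the command line should now work as expected and print usage information. You can now use pytesseract (don't forget to restart the Python kernel before running it!):

import pytesseract
from PIL import Image

# Open the image and run OCR on it
image = Image.open("text_image.png")
text = pytesseract.image_to_string(image, config='')
print("text present in image:", text)

Enjoy!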

Answer 8

You can download the tesseract-ocr installer from the following link

Tesseract for windows

Then add a new environment variable named tesseract with the value C:\Program Files (x86)\Tesseract-OCR\tesseract.exe
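If you go the environment-variable route, the script can read the variable explicitly instead of relying on anything picking it up automatically. A minimal sketch, assuming the variable is named `tesseract` as in the answers above; `cmd_from_env` is a hypothetical helper.

```python
import os


def cmd_from_env(var="tesseract", default="tesseract"):
    """Return the value of environment variable `var`, or `default` if unset or empty."""
    return os.environ.get(var) or default


try:
    import pytesseract
    # Point pytesseract at whatever the environment variable contains.
    pytesseract.pytesseract.tesseract_cmd = cmd_from_env()
except ImportError:
    pass  # pytesseract is not installed in this interpreter
```

Environment variables are read when a process starts, so a terminal or notebook kernel opened before the variable was created will not see it.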

Answer 9

I faced the same problem. Simply running this command fixed it for me.

sudo apt install tesseract-ocr

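Note that this only works on Ubuntu and other Debian-based distributions.
sudo is a Unix command (Linux, Mac, Raspbian, etc.), while apt is specific to Debian-based systems such as Ubuntu.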

Answer 10

Install tesseract from https://github.com/UB-Mannheim/tesseract/wiki and add the folder containing tesseract.exe to the Path environment variable.

Answer 11

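Setting the path takes only a few steps

1: Go to "https://github.com/UB-Mannheim/tesseract/wiki"

2: Download the latest installer

3: Install it

4: Set the path in the system variables, e.g. "C:\Program Files\Tesseract-OCR" or "C:\Program Files (x86)\Tesseract-OCR"

5: Open CMD and type "tesseract"; you should see some output rather than a "not recognized" error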
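The check in step 5 can also be done from Python by inspecting the PATH that the interpreter actually sees. A sketch using only the standard library; `dir_on_path` is a hypothetical helper, and the directory passed at the end is one of the install locations mentioned above.

```python
import os


def dir_on_path(directory, path_value=None):
    """Return True if `directory` is one of the entries in a PATH-style string."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")

    def norm(p):
        # Ignore surrounding quotes, trailing separators, and (on Windows) case.
        return os.path.normcase(os.path.normpath(p.strip().strip('"')))

    target = norm(directory)
    entries = [e for e in path_value.split(os.pathsep) if e.strip()]
    return any(norm(entry) == target for entry in entries)


print(dir_on_path(r"C:\Program Files\Tesseract-OCR"))
```

If this prints False in a session started after step 4, the variable was probably edited in the wrong place or the session predates the change.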

Answer 12

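Note: applies to Windows only

I ran into this problem today, and all the answers here helped, but I still had to spend a lot of time resolving it. So let me lay out the solution in a very simple form to help everyone else:

Download the 64-bit installer exe (or the 32-bit one if your computer is 32-bit).

(The file name is tesseract-ocr-w64-setup-v5.0.0.20190526 (alpha))
Install it. Let it install itself in the default C directory.
Now go to your environment variables (search for them in the Start menu, or go to Control Panel > System > Advanced System Settings > Environment Variables)

a) Select "Path" and edit it. Click "New" and add the installation path (usually C:\Program Files\Tesseract-OCR\)

Now you will no longer get the error!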

Answer 13

For Mac:

1: Install pytesseract (pip install pytesseract should work)
2: Install Tesseract, but only via Homebrew; for some reason the pip installation does not work. (brew install tesseract)
3: Get the path of the brew-installed Tesseract on your device (brew list tesseract)
4: Add the path in your code, not in the sys path. Use pytesseract.pytesseract.tesseract_cmd = '<path received in step 3>' alongside your code (e.g. pytesseract.pytesseract.tesseract_cmd = '/usr/local/Cellar/tesseract/4.0.0_1/bin/tesseract')

This should work fine.

Answer 14

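A small gotcha: I learned that I had to close and reopen cmd for the updated path to take effect. With Jupyter Notebook, I had to shut down the client and start it again.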
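When restarting is inconvenient, the PATH of the already-running process can be extended instead. This is only a sketch of that workaround: `prepend_to_path` is a hypothetical helper, and the change lasts only for the current process and the programs it launches.

```python
import os


def prepend_to_path(directory):
    """Put `directory` at the front of PATH for this process and its children."""
    current = os.environ.get("PATH", "")
    entries = [e for e in current.split(os.pathsep) if e]
    if directory not in entries:
        os.environ["PATH"] = os.pathsep.join([directory] + entries)
    return os.environ["PATH"]
```

For example, calling `prepend_to_path(r"C:\Program Files\Tesseract-OCR")` in the first notebook cell should let later cells find the binary without a restart, assuming that folder is where Tesseract was installed.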

Answer 15

The following three commands will do the job:

# Refresh the package index
sudo apt update
# Install the Tesseract OCR engine
sudo apt install tesseract-ocr
# Install the Tesseract development files
sudo apt install libtesseract-dev

Answer 16

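You most likely have several versions of Python installed. Make sure pytesseract is installed for the same Python version you are running.

which pip3 shows the path where pip3 is installed, and which python3 shows the path of the corresponding Python installation.

Make sure the two match.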
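The same comparison can be made from inside Python, which avoids guessing which interpreter a notebook or IDE is using. A minimal sketch with the standard library; `module_location` is a hypothetical helper name.

```python
import importlib.util
import sys


def module_location(name):
    """Return the file a module would be imported from, or None if it is not installed."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


print("interpreter:", sys.executable)
print("pytesseract:", module_location("pytesseract"))
```

If the second line prints None, pytesseract was installed for a different interpreter; running `python3 -m pip install pytesseract` with the interpreter shown on the first line installs it for that one.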

Answer 17

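I am currently on Windows and need to develop a PDF parser, but just adding a new environment variable via sysdm.cpl did not work. For other Windows users, I strongly recommend also adding C:\Program Files (x86)\Tesseract-OCR to your profile.ps1 (if you use PowerShell).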

Answer 18

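I faced the same problem; simply adding C:\Program Files (x86)\Tesseract-OCR to the Path variable fixed it. If it still does not work, add C:\Program Files (x86)\Tesseract-OCR\tessdata to the Path variable as a new entry. And don't forget to restart your computer after changing the Path variable.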

Answer 19

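I got the same error when trying to build a text extraction program with pytesseract, but the solution was in the pytesseract installation instructions on the PyPI site: pytesseract offers several ways to avoid the error, and for me adding a parameter to pytesseract.image_to_string was the convenient one, for example

tessdata_dir_config = r'--tessdata-dir "/usr/share/tesseract-ocr/4.00/tessdata"'
output = pytesseract.image_to_string(image, lang='eng', config=tessdata_dir_config)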
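Before passing a tessdata directory, it can help to confirm that it exists and contains the language you ask for. The sketch below assumes language data is stored as `<lang>.traineddata` files; `available_languages` and `tessdata_config` are hypothetical helpers.

```python
import os


def available_languages(tessdata_dir):
    """List language codes that have a .traineddata file in `tessdata_dir`."""
    if not os.path.isdir(tessdata_dir):
        return []
    suffix = ".traineddata"
    return sorted(
        name[: -len(suffix)]
        for name in os.listdir(tessdata_dir)
        if name.endswith(suffix)
    )


def tessdata_config(tessdata_dir):
    """Build the config string that passes --tessdata-dir to tesseract."""
    return '--tessdata-dir "{}"'.format(tessdata_dir)


print(available_languages("/usr/share/tesseract-ocr/4.00/tessdata"))
```

An empty list means the directory is missing or holds no language data, in which case pointing the config at it will not help.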

Answer 20

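This happens on Windows (at least with tesseract 3.05) when the current directory is on a different drive from the one where tesseract is installed.

Something in tesseract expects the data files to be in \Program Files... (rather than C:\Program Files). So if you are not on the same drive letter as tesseract, it fails. A workaround is to temporarily change the drive (on Windows only) to the tesseract installation drive before running tesseract, and change it back afterwards. For your example: you can copy yourmodule_python.py to "C:/Program Files (x86)/Tesseract-OCR/" and run it from there!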
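The "change drive, run, change back" workaround can be wrapped in a context manager so the original directory is always restored. This is a sketch of that idea, not something pytesseract provides; `working_directory` is a hypothetical name.

```python
import contextlib
import os


@contextlib.contextmanager
def working_directory(path):
    """Temporarily switch the current directory (and drive on Windows) to `path`."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)  # always switch back, even if the body raises


# Intended use (left commented out because the path depends on your install):
# with working_directory(r"C:\Program Files (x86)\Tesseract-OCR"):
#     text = pytesseract.image_to_string(img)
```

This avoids having to copy your script into the Tesseract folder.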

Answer 21

Are you importing it like this?

from tesseract import image_to_string

rather than importing from pytesseract?

Answer 22

If you are using Linux, just run the following command,

then run this

Answer 23

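Under the Flask web framework on Ubuntu, this should work

import pytesseract
from PIL import Image

pytesseract.pytesseract.tesseract_cmd = r"/usr/bin/tesseract"
img = Image.open(picture_name)  # picture_name: path to your image file
print(pytesseract.image_to_string(img))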

Answer 24

For me, it worked once I used single quotes

pytesseract.pytesseract.tesseract_cmd = r'C:/Program Files/Tesseract-OCR/tesseract.exe'

Putting it inside double quotes was automatically inserting unwanted characters

Tesseract not found error
