Can I test Tesseract OCR from the Windows command line?

Time: 2014-10-08 07:42:04

Tags: ocr tesseract python-tesseract

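I am new to Tesseract OCR. I tried converting an image to TIFF and running it through tesseract from cmd on Windows to see what it outputs, but I could not get it to work. Can you help me? What command should I use?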

Here is my sample image:

enter image description here

1 answer:

Answer 0 (score: 15)

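The simplest tesseract.exe syntax is tesseract.exe inputimage output-text-file. This assumes that tesseract.exe has been added to the PATH environment variable. If the text in your image is particularly hard to recognize, you can add the -psm N argument.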
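If you would rather drive the same command from a script, a minimal Python sketch could look like the following. The helper names `output_path` and `run_ocr` are hypothetical, and the sketch assumes that `tesseract` is on PATH, that the build is a 3.x release where the switch is spelled `-psm`, and that the engine appends `.txt` to the output base name (as the session further down shows with `out.txt.txt`).

```python
import subprocess


def output_path(output_base):
    """Return the file tesseract is assumed to write for a given output base."""
    # Assumption: tesseract appends ".txt" to whatever base name it is given.
    return output_base + ".txt"


def run_ocr(image, output_base, psm=None):
    """Run tesseract on an image and return the recognized text."""
    cmd = ["tesseract", image, output_base]
    if psm is not None:
        # Optional page segmentation mode, passed as "-psm N"
        cmd += ["-psm", str(psm)]
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    with open(output_path(output_base), encoding="utf-8") as f:
        return f.read()
```

Calling `run_ocr("ECL8R.png", "out")` would then correspond to `tesseract ECL8R.png out` followed by reading `out.txt`.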

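I found that the regular syntax (without any -psm switch) works well on the image you attached, unless that level of accuracy is not good enough for you.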

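Note that non-English characters (such as the symbol next to "prescription") are not recognized; my default installation contains only the English training data.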
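To check which training data an installation actually has, the `--list-langs` option from the usage text can be called from a script. The sketch below is only an illustration: `parse_langs` and `installed_langs` are hypothetical names, and the output format (a header line ending in a colon, then one language code per line) is an assumption that may differ between versions. Both stdout and stderr are read because it is not certain which stream a given build writes to.

```python
import subprocess


def parse_langs(output):
    """Extract language codes from the text printed by `tesseract --list-langs`."""
    langs = []
    for line in output.splitlines():
        line = line.strip()
        # Assumed format: skip blank lines and the header line ending in ":"
        if not line or line.endswith(":"):
            continue
        langs.append(line)
    return langs


def installed_langs():
    """Ask tesseract (assumed to be on PATH) which languages are installed."""
    result = subprocess.run(
        ["tesseract", "--list-langs"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return parse_langs(result.stdout + "\n" + result.stderr)
```

If the language you need is missing from the list, its training data has to be installed before `-l lang` can select it.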

Here is the tesseract usage description:

C:\Users\vish\Desktop>tesseract.exe
Usage:tesseract.exe imagename outputbase [-l lang] [-psm pagesegmode] [configfile...]

pagesegmode values are:
0 = Orientation and script detection (OSD) only.
1 = Automatic page segmentation with OSD.
2 = Automatic page segmentation, but no OSD, or OCR
3 = Fully automatic page segmentation, but no OSD. (Default)
4 = Assume a single column of text of variable sizes.
5 = Assume a single uniform block of vertically aligned text.
6 = Assume a single uniform block of text.
7 = Treat the image as a single text line.
8 = Treat the image as a single word.
9 = Treat the image as a single word in a circle.
10 = Treat the image as a single character.
-l lang and/or -psm pagesegmode must occur before anyconfigfile.

Single options:
  -v --version: version info
  --list-langs: list available languages for tesseract engine
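The ordering rule in the usage text (`-l lang` and `-psm pagesegmode` before any config file) is easy to get wrong when assembling the command by hand. A small sketch that enforces it is shown below; `tesseract_args` and the `PSM_*` constants are hypothetical names, the mode numbers are copied from the 3.02 usage text above, and other versions may number or spell them differently.

```python
# Page segmentation modes, copied from the tesseract 3.02 usage text
PSM_SINGLE_COLUMN = 4
PSM_SINGLE_BLOCK = 6
PSM_SINGLE_LINE = 7
PSM_SINGLE_WORD = 8
PSM_SINGLE_CHAR = 10


def tesseract_args(image, output_base, lang=None, psm=None, configfiles=()):
    """Build the argument list with options placed before any config files."""
    args = ["tesseract", image, output_base]
    if lang is not None:
        args += ["-l", lang]
    if psm is not None:
        args += ["-psm", str(psm)]
    # Config files must come last, after -l and -psm
    args += list(configfiles)
    return args
```

For example, `tesseract_args("ECL8R.png", "out", lang="eng", psm=PSM_SINGLE_BLOCK)` yields the arguments for `tesseract ECL8R.png out -l eng -psm 6`.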

Here is the output for your image (note: when I downloaded it, it was converted to a PNG image):

C:\Users\vish\Desktop>tesseract.exe ECL8R.png out.txt
Tesseract Open Source OCR Engine v3.02 with Leptonica

C:\Users\vish\Desktop>type out.txt.txt
1 Project Background

A prescription (R) is a written order by a physician or medical doctor to a pharmacist in the form of
medication instructions for an individual patient. You can't get prescription medicines unless someone
with authority prescribes them. Usually, this means a written prescription from your doctor. Dentists,

optometrists, midwives and nurse practitioners may also be authorized to prescribe medicines for you.

It can also be defined as an order to take certain medications.

A prescription has legal implications; this means the prescriber must assume his responsibility for the
clinical care ofthe patient.

Recently, the term "prescriptionΓÇ¥ has known a wider usage being used for clinical assessments,