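I need an application that watches numbers on my screen and then does calculations with them. After a few days of researching the best and simplest approach, I found this video (https://www.youtube.com/watch?v=Kjdu8SjEtG0), which led me to use OCR with EMGU-Tesseract in Visual Basic 2010 Express. I followed the video completely and made my own modifications to the code from the video description.
I imported: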
Imports Emgu.CV
Imports Emgu.Util
Imports Emgu.CV.OCR
Imports Emgu.CV.Structure
Then, based on the original code, I wrote:
Dim OCRz As Tesseract = New Tesseract("tessdata", "eng", Tesseract.OcrEngineMode.OEM_TESSERACT_ONLY)
Dim picStc1 As Bitmap = New Bitmap(149, 28)
Dim gfxSTK1 As Graphics = Graphics.FromImage(picStc1)
Dim picNam1 As Bitmap = New Bitmap(149, 28)
Dim gfxNAM1 As Graphics = Graphics.FromImage(picNam1)
Private Sub Timer1_Tick(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Timer1.Tick
    ' Copy the screen area behind each PictureBox into its bitmap, then show it
    gfxSTK1.CopyFromScreen(New Point(Me.Location.X + Stk1.Location.X + 5, Me.Location.Y + Stk1.Location.Y + 24), New Point(0, 0), picStc1.Size)
    Stk1.Image = picStc1
    gfxNAM1.CopyFromScreen(New Point(Me.Location.X + Nome1.Location.X + 5, Me.Location.Y + Nome1.Location.Y + 24), New Point(0, 0), picNam1.Size)
    Nome1.Image = picNam1
End Sub
And this runs when I press the button:
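Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
    ' Run OCR on each captured bitmap and show the recognized text
    OCRz.Recognize(New Image(Of Bgr, Byte)(picStc1))
    BOXSTK1.Text = OCRz.GetText
    OCRz.Recognize(New Image(Of Bgr, Byte)(picNam1))
    BoxNAME1.Text = OCRz.GetText
End Sub
So now the OCR engine reads the text from the PictureBox images (picStc1) and (picNam1) and, after I press the button, writes it to the RichTextBoxes (BoxSTK1) and (BoxNAME1).
The number in the RichTextBox (BoxSTK1) comes out with commas and other symbols, but I only want to capture the digits. I found this (https://code.google.com/p/tesseract-ocr/wiki/FAQ#How_do_I_recognize_only_digits?), but I can't get it working in my project. Can anyone help?
(I'm using Emgu 2.9.0.1922 and don't know how to check the Tesseract version.)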
Answer 0 (score: 0)
This digits-only "whitelist" appears to be something you set when initializing the object. Check out this question.
So you need to change
Dim OCRz As Tesseract = New Tesseract("tessdata", "eng", Tesseract.OcrEngineMode.OEM_TESSERACT_ONLY)
to something like this:
Dim OCRz As Tesseract = New Tesseract()
OCRz.SetVariable("tessedit_char_whitelist", "0123456789")
OCRz.Init("tessdata", "eng", False)
Answer 1 (score: 0)
First, define the whitelist with:
OCRz.SetVariable("tessedit_char_whitelist", ",$0123456789")
Then convert the string like this and print it:
RichTextBox1.Text = Convert.ToString(OCRz.GetText).Replace("$", "").Replace(",", "")
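In the end we have this:
Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
    ' Restrict recognition to digits plus the "$" and "," that appear on screen
    OCRz.SetVariable("tessedit_char_whitelist", ",$0123456789")
    OCRz.Init("tessdata", "eng", False)
    OCRz.Recognize(New Image(Of Bgr, Byte)(pic))
    ' Strip the currency symbol and thousands separators from the result
    RichTextBox1.Text = Convert.ToString(OCRz.GetText).Replace("$", "").Replace(",", "")
End Sub
Thanks again to Jimmy Smith for the quick answer, it was very useful. Note to self: upvote this guy ;)
Answer 2 (score: 0)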
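If other stray symbols still slip through, the chained Replace calls can be generalized. Below is a minimal sketch that uses only the .NET base library (no Emgu calls), so it should behave the same whatever OCR wrapper produced the text. The names OcrTextCleanup, ExtractDigits and TryReadNumber are hypothetical helpers made up for this illustration; they are not part of Emgu or Tesseract. The sketch assumes the values are whole numbers, as in the question.

```vbnet
Imports System
Imports System.Globalization
Imports System.Text

Module OcrTextCleanup

    ' Hypothetical helper: keeps only the characters 0-9 from raw OCR output.
    Function ExtractDigits(ByVal raw As String) As String
        Dim digits As New StringBuilder()
        If raw IsNot Nothing Then
            For Each c As Char In raw
                If c >= "0"c AndAlso c <= "9"c Then digits.Append(c)
            Next
        End If
        Return digits.ToString()
    End Function

    ' Hypothetical helper: turns the cleaned text into a Decimal for calculations.
    ' Returns False when no digits were recognized or the value does not fit.
    Function TryReadNumber(ByVal raw As String, ByRef value As Decimal) As Boolean
        Return Decimal.TryParse(ExtractDigits(raw), NumberStyles.None,
                                CultureInfo.InvariantCulture, value)
    End Function

    Sub Main()
        Dim amount As Decimal
        ' "$1,234 " stands in for whatever OCRz.GetText returned
        If TryReadNumber("$1,234 ", amount) Then
            Console.WriteLine(amount * 2D) ' prints 2468
        End If
    End Sub

End Module
```

In the button handler this would be used as RichTextBox1.Text = ExtractDigits(OCRz.GetText). Note that a decimal point is dropped as well, so "1.5" would become "15"; if the on-screen numbers have fractions, the filter would need to keep the separator.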
Dim OCRz As Tesseract =
    New Tesseract("tessdata", "eng", Tesseract.OcrEngineMode.OEM_DEFAULT)
Dim pic As Bitmap = New Bitmap(270, 100)
Dim gfx As Graphics = Graphics.FromImage(pic)