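I almost have a working program: the code works with a sample file, but I can't edit the photo of my meter well enough for OCR to work on it.
My output image looks very close to something that should work, but I don't know what else I can do to the image to get it recognized.
Here is my code: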
import pytesseract
from PIL import Image, ImageFilter, ImageOps

image_file_ocr = 'ocr_output.jpg'
image_file = 'image_original.jpg'
#image_file = 'ocr2.jpg'
#image_file = 'sample1.jpg'
#image_file = 'sample2.jpg'
#image_file = 'sample3.jpg'
#image_file = 'sample4.jpg' # long text
#image_file = 'sample5.jpg' #image_text = "1234567890"
print(image_file)

# LOAD THE IMAGE
#image = Image.open('sample5.jpg')
image = Image.open(image_file) # open colour image
image = image.convert('L') # convert image to greyscale - this works
#image = image.convert('1') # convert image to black and white
image = image.rotate(-90)

# EDIT THE IMAGE
w, h = image.size
#image = image.crop((0, 30, w, h-30))
image = image.crop((350, 680, 1100, 770))
image = image.filter(ImageFilter.SHARPEN) # filter() returns a new image, so keep the result
image = ImageOps.invert(image)
image.save(image_file_ocr, 'jpeg')

# PROCESS THE IMAGE
print("\n\nProcessing image: " + image_file_ocr)
image = Image.open(image_file_ocr)
print("Process method 1:")
text = pytesseract.image_to_string(image, config='outputbase digits')
print(text)
print("Process method 2:")
text = pytesseract.image_to_string(image)
print(text)
The following image works fine:
Answer 0 (score: 1)
You could consider adding a config user file with the pattern \d\d\d\d\d\d\d\d (8 digits). Also note the default page segmentation method:
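By default Tesseract expects a page of text when it segments an image. If you're just seeking to OCR a small region, try a different segmentation mode, using the -psm argument. Note that adding a white border to text which is too tightly cropped may also help, see issue 398.
As of 3.04: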
0 Orientation and script detection (OSD) only.
1 Automatic page segmentation with OSD.
2 Automatic page segmentation, but no OSD, or OCR.
3 Fully automatic page segmentation, but no OSD. (Default)
4 Assume a single column of text of variable sizes.
5 Assume a single uniform block of vertically aligned text.
6 Assume a single uniform block of text.
7 Treat the image as a single text line.
8 Treat the image as a single word.
9 Treat the image as a single word in a circle.
10 Treat the image as a single character.
So you probably want -psm 7 for your cropped image (the flag is spelled --psm in Tesseract 4 and later).
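As a rough sketch of how that might look from pytesseract: the helper below builds a config string for page segmentation mode 7 and pads the crop with a white border before running OCR. The names `build_config` and `ocr_single_line`, the 20-pixel border, and the digit whitelist via `tessedit_char_whitelist` are my own assumptions, not something taken from the question. The flag is a parameter because Tesseract 3.x spells it `-psm` while 4.x and later use `--psm`.

```python
def build_config(psm=7, digits_only=True, psm_flag="--psm"):
    """Build a Tesseract config string for a small cropped region.

    psm_flag is "--psm" for Tesseract 4 and later, "-psm" for 3.x.
    """
    parts = ["{} {}".format(psm_flag, psm)]
    if digits_only:
        # Assumption: restrict recognition to digits with a whitelist.
        parts.append("-c tessedit_char_whitelist=0123456789")
    return " ".join(parts)


def ocr_single_line(path, border=20, psm_flag="--psm"):
    """OCR an already-cropped, dark-on-light image as one text line."""
    # Imported here so build_config can be used without these installed.
    from PIL import Image, ImageOps
    import pytesseract

    image = Image.open(path).convert("L")
    # Add a white margin in case the text is cropped too tightly.
    image = ImageOps.expand(image, border=border, fill=255)
    text = pytesseract.image_to_string(
        image, config=build_config(psm_flag=psm_flag)
    )
    return text.strip()


if __name__ == "__main__":
    print(build_config())
    print(build_config(psm_flag="-psm", digits_only=False))
```

Calling `ocr_single_line('ocr_output.jpg')` would then run the whole thing on the cropped file. Whether the whitelist is honoured depends on the Tesseract version and engine, so it is worth comparing the result with and without it.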
See also this answer for how to apply filters.
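For the 8-digit pattern suggested at the start of this answer, a minimal sketch could write the user-patterns file and also check the OCR result against the same shape afterwards. The file name `meter.patterns` and all helper names are hypothetical. `--user-patterns` is the command-line option in recent Tesseract versions; how strongly the pattern is honoured depends on the version and engine, so the regex check is a cheap safety net rather than a replacement for it.

```python
import os
import re
import tempfile

METER_PATTERN = r"\d\d\d\d\d\d\d\d"  # eight digits, as suggested above


def write_patterns_file(directory, name="meter.patterns"):
    """Write a one-line user-patterns file and return its path."""
    path = os.path.join(directory, name)
    with open(path, "w") as handle:
        handle.write(METER_PATTERN + "\n")
    return path


def patterns_config(path, psm_flag="--psm"):
    """Config string: single text line plus the user-patterns file.

    Assumes the path contains no spaces, since it is not quoted.
    """
    return "{} 7 --user-patterns {}".format(psm_flag, path)


def looks_like_reading(text):
    """True if the OCR output is exactly eight digits after stripping whitespace."""
    return re.fullmatch(METER_PATTERN, text.strip()) is not None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(patterns_config(write_patterns_file(tmp)))
    print(looks_like_reading("00123456\n"))
    print(looks_like_reading("0012E456"))
```

The string from `patterns_config` can be passed as the `config` argument of `pytesseract.image_to_string`, and `looks_like_reading` then flags any result that is not a plausible meter reading.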