Force tesseract-ocr to recognize a single character?

Time: 2015-02-25 03:57:09

Tags: python tesseract

[image]

    import tesseract  # python-tesseract bindings

    api = tesseract.TessBaseAPI()
    api.SetOutputName("output")

    api.Init(".", "eng", tesseract.OEM_DEFAULT)
    # Restrict recognition to digits
    api.SetVariable("tessedit_char_whitelist", "0123456789")
    # Mode 7: treat the image as a single text line
    api.SetVariable("tessedit_pageseg_mode", "7")

    pixImage = tesseract.pixRead('img.jpg')

    api.SetImage(pixImage)

    outText = api.GetUTF8Text()
    # Strip newlines and spaces from the OCR result
    answer = outText.replace("\n", "").replace(" ", "")

    print answer

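The image above, which shows '1', is recognized as '47' by tesseract, even though I have set api.SetVariable("tessedit_pageseg_mode", "7"). How can I force it to recognize a single character in the image?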

P.S.: Python 2.7.3 + tesseract 3.0.2
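One thing that may be worth trying (a sketch, not a confirmed fix for this image): in Tesseract 3.x, page segmentation mode 7 means "treat the image as a single text line", while mode 10 means "treat the image as a single character". The snippet below reuses only the calls already shown in the question and switches the mode to 10. `build_variables` and `recognize_single_char` are hypothetical helper names, and the `tesseract` module is passed in as an argument so the helpers can be read and tested without the bindings installed.

```python
# Page segmentation modes as numbered in Tesseract 3.x.
PSM_SINGLE_LINE = 7   # single text line (the mode used in the question)
PSM_SINGLE_WORD = 8   # single word
PSM_SINGLE_CHAR = 10  # single character


def build_variables(psm=PSM_SINGLE_CHAR, whitelist="0123456789"):
    """Return (name, value) pairs for api.SetVariable().

    Hypothetical helper for illustration; not part of any library.
    """
    return [
        ("tessedit_char_whitelist", whitelist),
        ("tessedit_pageseg_mode", str(psm)),
    ]


def recognize_single_char(image_path, tess):
    """OCR one image in single-character mode.

    `tess` is assumed to be the imported `tesseract` module from the
    question; only calls that appear in the question are used.
    """
    api = tess.TessBaseAPI()
    api.Init(".", "eng", tess.OEM_DEFAULT)
    for name, value in build_variables():
        api.SetVariable(name, value)
    api.SetImage(tess.pixRead(image_path))
    text = api.GetUTF8Text()
    # Same cleanup as in the question: drop newlines and spaces.
    return text.replace("\n", "").replace(" ", "")
```

With the bindings installed, this would be called as `recognize_single_char('img.jpg', tesseract)` after `import tesseract`. Whether mode 10 alone turns the '47' result into '1' is untested here.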

0 answers:

No answers