I am trying to extract text from images, but currently I get an empty string as output. Below is my pytesseract code, although I am also open to Keras OCR:
from PIL import Image
import pytesseract
path = 'captcha.svg.png'
img = Image.open(path)
captchaText = pytesseract.image_to_string(img, lang='eng', config='--psm 6')  # currently returns an empty string
I am not sure how to work with SVG images, so I converted them to PNG. Below are some sample images:
Edit 1 (2021-05-19): I was able to convert the SVGs to PNG using cairosvg, but I still cannot read the captcha text.
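For reference, a minimal sketch of that cairosvg conversion, assuming the captcha is saved locally as captcha.svg (the file names are illustrative):
import cairosvg
# Rasterize the SVG captcha into a PNG that PIL/pytesseract can open
cairosvg.svg2png(url='captcha.svg', write_to='captcha.svg.png')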
Edit 2 (2021-05-20): Keras OCR also returns nothing for these images.
Answer 0 (score: 0)
The reason keras-ocr is not working or returns nothing for your images is that they are grayscale (I found that it works fine on RGB input). See below:
from PIL import Image
a = Image.open('/content/gD7vA.png')  # keras-ocr returns nothing for this image
a.mode, a.split()  # mode LA: single grayscale channel + transparency/alpha layer
b = Image.open('/content/CYegU.png')  # keras-ocr returns a result for this image
b.mode, b.split()  # mode RGBA: RGB channels + transparency/alpha layer
In the above, a is the file you mentioned in your question; as shown, it is grayscale plus a transparency/alpha layer (mode LA). b is a file I converted to RGB (or RGBA). The transparency layer is already part of your original file; I did not remove it, but keeping it does not seem necessary. In short, to make your input work with keras-ocr, first convert the files to RGB (or RGBA) and save them to disk, then pass them to the OCR.
# Using PIL to convert one mode to another
# and save on disk
c = Image.open('/content/gD7vA.png').convert('RGBA')
c.save(....png)
c.mode, c.split()
('RGBA',
(<PIL.Image.Image image mode=L size=150x50 at 0x7F03E8E7A410>,
<PIL.Image.Image image mode=L size=150x50 at 0x7F03E8E7A590>,
<PIL.Image.Image image mode=L size=150x50 at 0x7F03E8E7A810>,
<PIL.Image.Image image mode=L size=150x50 at 0x7F03E8E7A110>))
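A short sketch of the conversion step described above, applied to all of the sample captchas in a loop; the file names match the ones used below, and the 'rgba_' output prefix is illustrative:
from PIL import Image
# Convert each captcha to RGBA and save the converted copy to disk
for name in ['gD7vA.png', 'CYegU.png', 'bw6Eq.png', 'jH2QS.png', 'xbADG.png']:
    img = Image.open('/content/' + name).convert('RGBA')
    img.save('/content/rgba_' + name)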
Full code
import matplotlib.pyplot as plt
import keras_ocr

# keras-ocr will automatically download pretrained
# weights for the detector and recognizer.
pipeline = keras_ocr.pipeline.Pipeline()

# Get a set of four example images
images = [
    keras_ocr.tools.read(url) for url in [
        '/content/CYegU.png',  # mode: RGBA; RGB alone should work too
        '/content/bw6Eq.png',  # mode: RGBA
        '/content/jH2QS.png',  # mode: RGBA
        '/content/xbADG.png',  # mode: RGBA
    ]
]
# Each list of predictions in prediction_groups is a list of
# (word, box) tuples.
prediction_groups = pipeline.recognize(images)
Looking for /root/.keras-ocr/craft_mlt_25k.h5
Looking for /root/.keras-ocr/crnn_kurapan.h5
prediction_groups
[[('zum', array([[ 10.658852, 15.11916 ],
[148.90204 , 13.144257],
[149.39563 , 47.694347],
[ 11.152428, 49.66925 ]], dtype=float32))],
[('sresa', array([[ 5., 15.],
[143., 15.],
[143., 48.],
[ 5., 48.]], dtype=float32))],
[('sycw', array([[ 10., 15.],
[149., 15.],
[149., 49.],
[ 10., 49.]], dtype=float32))],
[('vdivize', array([[ 10.407883, 13.685192],
[140.62648 , 16.940662],
[139.82323 , 49.070583],
[ 9.604624, 45.815113]], dtype=float32))]]
Display
# Plot the predictions
fig, axs = plt.subplots(nrows=len(images), figsize=(20, 20))
for ax, image, predictions in zip(axs, images, prediction_groups):
    keras_ocr.tools.drawAnnotations(image=image, predictions=predictions, ax=ax)
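Since the original goal was to get the captcha text as a plain string, here is a small follow-up sketch (assuming the prediction_groups shown above) that joins the recognized words per image; for these single-word captchas the word order does not matter:
# Collapse each image's (word, box) predictions into one string per captcha
captcha_texts = [''.join(word for word, box in predictions)
                 for predictions in prediction_groups]
print(captcha_texts)  # e.g. ['zum', 'sresa', 'sycw', 'vdivize']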