Captcha solver in Python

Time: 2019-06-16 16:37:56

Tags: python python-3.x

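I am trying to build a captcha reader for a web scraping project.

I followed the steps described at https://www.scrapehero.com/how-to-solve-simple-captchas-using-python-tesseract/ , but I get the error shown further down. Can someone tell me what the problem is?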

import pytesseract
import sys
import argparse

try:
    import Image
except ImportError:
    from PIL import Image
from subprocess import check_output


def resolve(path):
    print("Resampling the Image")
    check_output(['convert', path, '-resample', '600', path])
    return pytesseract.image_to_string(Image.open(path))



if __name__ == "__main__":
    argparser = argparse.ArgumentParser()
    argparser.add_argument('path', help='Captcha file path')
    args = argparser.parse_args()
    path = args.path
    print('Resolving Captcha')
    captcha_text = resolve(path)
    print('Extracted Text', captcha_text)

The expected result is the 6-letter captcha printed as text. Instead, I get this error:

Resolving Captcha
Resampling the Image
Invalid Parameter - -resample
Traceback (most recent call last):
  File "captcha_resolver.py", line 25, in <module>
    captcha_text = resolve(path)
  File "captcha_resolver.py", line 14, in resolve
    check_output(['convert', path, '-resample', '600', path])
  File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 395, in check_output
    **kwargs).stdout

Other people who tried this have run into a similar problem, as shown here: https://gist.github.com/scrapehero/b85a280dc0d993f665c91e0332cf618f
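One possible explanation, offered as an assumption rather than a confirmed diagnosis: the traceback path (C:\ProgramData\Anaconda3) shows the script runs on Windows, where the bare name `convert` can resolve to the system disk utility in System32 instead of ImageMagick. That utility rejects unknown arguments with an "Invalid Parameter" message like the one above. The sketch below looks for an ImageMagick executable explicitly before resampling. The helper names (`is_windows_system_convert`, `find_imagemagick`, `build_resample_command`, `resample`) are hypothetical, and it assumes ImageMagick is installed and on PATH.

```python
import shutil
import subprocess


def is_windows_system_convert(executable):
    """Return True if the path points into a System32 directory."""
    parts = executable.lower().replace("\\", "/").split("/")
    return "system32" in parts


def find_imagemagick():
    """Return the path of an ImageMagick executable, or None if none is found."""
    # ImageMagick 7 ships a `magick` executable, which avoids the name clash.
    magick = shutil.which("magick")
    if magick:
        return magick
    # Older installs only provide `convert`; skip the Windows system tool.
    convert = shutil.which("convert")
    if convert and not is_windows_system_convert(convert):
        return convert
    return None


def build_resample_command(executable, path, dpi=600):
    """Build the argument list for resampling `path` in place."""
    return [executable, path, "-resample", str(dpi), path]


def resample(path, dpi=600):
    """Resample the image in place, failing clearly if ImageMagick is missing."""
    executable = find_imagemagick()
    if executable is None:
        raise RuntimeError("ImageMagick not found on PATH")
    subprocess.check_call(build_resample_command(executable, path, dpi))
```

With this in place, `resolve(path)` could call `resample(path)` instead of the hard-coded `check_output(['convert', ...])` line. If ImageMagick is not installed at all, the `RuntimeError` makes that visible instead of surfacing an unrelated tool's error message.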

0 answers:

No answers