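How to download an image CAPTCHA from a specific URL using requests.Session

Time: 2021-03-22 21:08:18

Tags: python python-requests python-imaging-library

Hi everyone, I am trying to fetch the image CAPTCHA from a website that I want to scrape. My problem is that the URL of the CAPTCHA image contains a parameter, and I cannot find where it comes from. So I tried parser.xpath, but it does not work. Here is my code: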

import requests, io, re
from PIL import Image
from lxml import html
headers = {
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebkit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.83 Safari/537.36",
}
session = requests.Session()
login_url = 'https://www.sat.gob.pe/WebSiteV8/popupv2.aspx?t=6'
login_form_res = session.get(login_url, headers=headers)
myhtml = login_form_res.text
evalu = ''
for match in re.finditer(r'(mysession=)(.*?)(")', myhtml):
    evalu = myhtml[match.start():match.end()]
    evalu = evalu.replace("mysession=", "")
    evalu = evalu.replace('"', '')
    print(evalu)

url_infractions = 'https://www.sat.gob.pe/VirtualSAT/modulos/RecordConductor.aspx?mysession=' + evalu
login_form_res = session.get(url_infractions, headers=headers)
myhtml = login_form_res.text
parser = html.fromstring(login_form_res.text)
idPic = parser.xpath('//img[@class="captcha_class"]/@src')
urlPic = "https://www.sat.gob.pe/VirtualSAT" + idPic[0].replace("..","")
print(urlPic)

image_content = session.get(urlPic, headers=headers)
image_file = io.BytesIO(image_content)
image = Image.open(image_file).convert('RGB').content
image.show()
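
The two URL-handling steps in the script above (pulling the mysession value out of the first page, then turning the relative captcha src into an absolute URL) could be sketched with a regex capture group and urllib.parse.urljoin. This is only a sketch under assumptions: the helper names extract_mysession and resolve_captcha_url are hypothetical, it returns the first match rather than the last one as the loop above does, and the real markup of the page may differ.

```python
import re
from typing import Optional
from urllib.parse import urljoin


def extract_mysession(page_html: str) -> Optional[str]:
    """Return the first mysession value found in the page, or None if absent.

    Hypothetical helper: same pattern idea as the loop in the question,
    but the capture group avoids the manual replace() calls.
    """
    match = re.search(r'mysession=([^"]*)', page_html)
    return match.group(1) if match else None


def resolve_captcha_url(page_url: str, src: str) -> str:
    """Resolve a possibly relative <img src> against the page it came from.

    Hypothetical helper: urljoin handles '..' segments, so no string
    replacement on the path is needed.
    """
    return urljoin(page_url, src)
```

With these, urlPic could be built as resolve_captcha_url(url_infractions, idPic[0]). It is also worth checking that idPic is not empty before indexing it, since an XPath that matches nothing returns an empty list and idPic[0] would then raise IndexError.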

As a result I get an exception that says TypeError: a bytes-like object is required, not 'Response'. I am confused. I would really appreciate your help. Thanks in advance.
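
The message points at io.BytesIO receiving the whole Response object: session.get returns a Response, and its raw body is available as the .content attribute (bytes). Separately, a PIL image has no .content attribute, so the trailing .content after convert('RGB') would fail next. A minimal sketch of the decoding step, assuming the response body really is image data; the helper name image_from_bytes is hypothetical:

```python
import io

from PIL import Image


def image_from_bytes(data: bytes) -> Image.Image:
    """Decode raw image bytes (for example response.content) into an RGB image."""
    return Image.open(io.BytesIO(data)).convert("RGB")


# How it would slot into the script from the question (not executed here):
#   image_response = session.get(urlPic, headers=headers)
#   image_response.raise_for_status()   # stop early on an HTTP error status
#   image = image_from_bytes(image_response.content)
#   image.show()
```

If the server answers with an HTML error page instead of an image, Image.open will raise an error of its own, so checking the status code first helps tell the two failures apart.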

0 Answers:

No answers yet.