Question

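I am trying to build a client of sorts for a blogging platform in my country, but the platform uses a captcha generator that was built in-house.

The problem is that the captcha is built so that a new image is generated on every GET request. So, suppose the captcha image URL is: http://example.com/randomcaptcha.aspx?someparams-that-are-always-the-same

Even if I open that link in Firefox and hit refresh (only the JPG image is displayed), I see a different image on every refresh.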

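The problem arises because when mechanize downloads the whole page, it also downloads the image during that request (or rather, it follows the randomcaptcha.aspx link). So when I then try to download the image myself, I have to issue another GET request to fetch it, and by that time the image has already changed.

How can I solve this?

Thanks.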

Edit: the current code looks like this:

import mechanize

browser = mechanize.Browser()
# The registration page contains the randomcaptcha.aspx URL in an img src.
browser.open("http://www.example.com/registration.aspx")
# A regex (not shown) then finds the image URL and stores it in the variable url.
with open("captcha.jpg", "wb") as file:
    file.write(browser.open_novisit(url).read())

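At this point the downloaded captcha.jpg is already different from the one shown on the registration page. I checked with a tool called Fiddler: two GET requests are definitely being made for the randomcaptcha.aspx URL.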

Edit #2: Solved. My bad, the captcha URL was incorrect.
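Since the fix turned out to be the captcha URL itself, here is a minimal sketch of what the regex step might look like. The post does not say what was wrong with the URL, so this only guards against two common causes that are assumed here: a relative src and HTML-escaped ampersands in the query string. The helper name find_captcha_url, the pattern, and the sample HTML are all hypothetical and not taken from the original code.

```python
import re
from html import unescape
from urllib.parse import urljoin

# Hypothetical pattern: captures the src of an <img> that points at randomcaptcha.aspx.
CAPTCHA_SRC = re.compile(
    r'<img[^>]+src=["\']([^"\']*randomcaptcha\.aspx[^"\']*)["\']',
    re.IGNORECASE,
)


def find_captcha_url(page_html, page_url):
    """Return the absolute captcha URL found in page_html, or None if absent."""
    match = CAPTCHA_SRC.search(page_html)
    if match is None:
        return None
    # Attribute values are HTML-escaped (&amp;), and src may be relative
    # to the page, so unescape first and then resolve against the page URL.
    return urljoin(page_url, unescape(match.group(1)))


# Made-up sample markup, only to show the helper in action.
sample_html = '<img id="c" src="randomcaptcha.aspx?a=1&amp;b=2">'
sample_url = find_captcha_url(sample_html, "http://www.example.com/registration.aspx")
print(sample_url)
```

The returned value could then be passed to browser.open_novisit(url) as in the code above, so that the request goes to the same URL the registration page actually references.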

How can I download a generated captcha using mechanize?

0 answers: