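I'm trying to write a web scraper using Python and the Mechanize library. I'm stuck at the authentication step: when I submit the login form I get a 400 Bad Request error, even though the same login works fine in the browser.
My Python code: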
import mechanize
import http.cookiejar
br = mechanize.Browser()
cj = http.cookiejar.LWPCookieJar()
br.set_cookiejar(cj)
br.set_debug_http(True)
br.set_handle_equiv(True)
br.set_handle_gzip(True)
br.set_handle_redirect(True)
br.set_handle_referer(True)
br.set_handle_robots(False)
br.set_handle_refresh(mechanize._http.HTTPRefreshProcessor(), max_time=1)
br.addheaders = [('User-agent', 'Mozilla/5.0')]
br.open('https://bintray.com/login')
# Select the second (index one) form (the first form is a search query box)
br.select_form(nr=1)
username_field = br.form.find_control(id='username')
username_field.value = '<username>'
password_field = br.form.find_control(id='password')
password_field.value = '<password>'
response = br.submit()
When I click the "Submit" button in Firefox, the HTTP request that gets generated is:
POST /login/signIn HTTP/1.1
Host: bintray.com
Connection: keep-alive
Content-Length: 25
Accept: */*
Origin: https://bintray.com
X-Requested-With: XMLHttpRequest
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.87 Safari/537.36
Sec-Fetch-Mode: cors
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
Sec-Fetch-Site: same-origin
Referer: https://bintray.com/login
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
Cookie: _ga=GA1.2.1918181459.1573714731; _gid=GA1.2.778939630.1573714731; _mkto_trk=id:256-FNZ-187&token:_mch-bintray.com-1573714730963-27642; _biz_uid=3e2df2a739744b43d74654c7f7ae663d; _fbp=fb.1.1573714731361.1960224524; _hjid=8e244023-d211-48a3-8309-e68b3d77f5b0; _biz_flagsA=%7B%22Version%22%3A1%2C%22Mkto%22%3A%221%22%2C%22XDomain%22%3A%221%22%2C%22Frm%22%3A%221%22%7D; JSESSIONID=51A27C863D8F59A85956B0AD78AB15D2; _dc_gtm_UA-36807562-1=1; _biz_sid=632e4a; _gat_UA-36807562-1=1; trwv.uid=jfrog-1573714731080-e4ac95c2%3A3; trwsa.sid=jfrog-1573746411047-cbd46fa2%3A3; _biz_nA=61; _biz_pendingA=%5B%5D
When I try to make the same request with mechanize, it looks like this:
POST /login/signIn HTTP/1.1
Accept-Encoding: gzip
Content-Type: application/x-www-form-urlencoded
Referer: https://bintray.com/login
Cookie: JSESSIONID=C839335ED741A4DDF5DE733AB54EECA1
Content-Length: 0
User-Agent: Mozilla/5.0
Host: bintray.com
Connection: close
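Comparing the two dumps, the mechanize request has Content-Length: 0 and no X-Requested-With header, so the form fields apparently never reach the request body. For reference, here is a minimal sketch of the request I would expect to be sent, built by hand with the standard library and not actually submitted. The field names "username" and "password" are an assumption based on the control ids (the real parameter names may differ), and build_login_request is just an illustrative helper name.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Path taken from the browser dump above.
LOGIN_URL = "https://bintray.com/login/signIn"


def build_login_request(username, password):
    """Build (but do not send) a POST shaped like the browser's request."""
    # Assumption: the server expects fields named "username" and "password";
    # only the control ids are known from the form.
    body = urlencode({"username": username, "password": password}).encode("utf-8")
    headers = {
        "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
        "X-Requested-With": "XMLHttpRequest",
        "Referer": "https://bintray.com/login",
        "User-Agent": "Mozilla/5.0",
    }
    return Request(LOGIN_URL, data=body, headers=headers, method="POST")


req = build_login_request("alice", "s3cret")
print(req.get_method(), req.full_url)
print(len(req.data), req.data)  # a non-empty body, unlike the mechanize dump
```

Presumably the same body and headers could be passed to mechanize through mechanize.Request and br.open, so that the cookie jar already attached to br is reused, but I have not confirmed that this is what the server wants.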
What should I do? Thanks in advance!