python请求登录网站返回403

时间:2012-11-25 17:01:17

标签: python django pyramid python-requests

我正在尝试使用requests登录网站,但您可以猜到我遇到了问题

这是我正在使用的代码

import requests

EMAIL = '***'
PASSWORD = '***'
URL = 'https://portal.bitcasa.com/login'

client = requests.session(config={'verbose': sys.stderr})
login_data = {'username': EMAIL, 'password': PASSWORD,}
r = client.post(URL, data=login_data, headers={"Referer": "foo"})
print r

如果我打印出r.text,我就会

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head><script type="text/javascript">var NREUMQ=NREUMQ||[];NREUMQ.push(["mark","firstbyte",new Date().getTime()])</script>
  <meta http-equiv="content-type" content="text/html; charset=utf-8">
  <meta name="robots" content="NONE,NOARCHIVE">
  <title>403 Forbidden</title>
  <style type="text/css">
    html * { padding:0; margin:0; }
    body * { padding:10px 20px; }
    body * * { padding:0; }
    body { font:small sans-serif; background:#eee; }
    body>div { border-bottom:1px solid #ddd; }
    h1 { font-weight:normal; margin-bottom:.4em; }
    h1 span { font-size:60%; color:#666; font-weight:normal; }
    #info { background:#f6f6f6; }
    #info ul { margin: 0.5em 4em; }
    #info p, #summary p { padding-top:10px; }
    #summary { background: #ffc; }
    #explanation { background:#eee; border-bottom: 0px none; }
  </style>
</head>
<body>
<div id="summary">
  <h1>Forbidden <span>(403)</span></h1>
  <p>CSRF verification failed. Request aborted.</p>

</div>

<div id="explanation">
  <p><small>More information is available with DEBUG=True.</small></p>
</div>

<script type="text/javascript">if(!NREUMQ.f){NREUMQ.f=function(){NREUMQ.push(["load",new Date().getTime()]);var e=document.createElement("script");e.type="text/javascript";e.src=(("http:"===document.location.protocol)?"http:":"https:")+"//"+"d1ros97qkrwjf5.cloudfront.net/42/eum/rum.js";document.body.appendChild(e);if(NREUMQ.a)NREUMQ.a();};NREUMQ.a=window.onload;window.onload=NREUMQ.f;};NREUMQ.push(["nrfj","beacon-1.newrelic.com","0e859e0620",778660,"ZAZRbUcHWBAHURFYX11MdUxbBUIKCVxKVVpSDVRWGwtfBwJeAEZRQQYdWkYUUFklQRdXZloGRHRcAlIPA0UEQ1UdE0FWVgNFEDlEDFRH",0,7,new Date().getTime(),"","","","",""])</script></body>
</html>

他们正在使用django和金字塔的组合。

我现在已经玩了大约两天,但显然已经无处可去了。谢谢你的帮助。

2 个答案:

答案 0 :(得分:12)

登录页面使用CSRF令牌来防止跨站点脚本攻击。您需要先检索该令牌。

登录页面设置一个具有相同令牌的cookie,我们需要先加载登录页面并获取该令牌,然后再将其传递给登录POST:

client = requests.session()

# Retrieve the CSRF token first
client.get(URL)  # sets the cookie
csrftoken = client.cookies['csrftoken']

login_data = dict(username=EMAIL, password=PASSWORD, csrfmiddlewaretoken=csrftoken)
r = client.post(URL, data=login_data, headers={"Referer": "foo"})

答案 1 :(得分:4)

正如错误消息所示,您错过了csrf token

首先需要GET登录页面,然后阅读csrf令牌以及带有其余表单数据的POST