Registration on this site is free: http://software.broadinstitute.org/gsea/login.jsp
Following a few tutorials, I wrote this code to log in to the site:
import requests
url = "http://software.broadinstitute.org/gsea/login.jsp"
# Fill in your details here to be posted to the login form.
payload = {
    'j_username': 'xxx@gmail.com',
    'j_password': 'password'
}
# Use 'with' to ensure the session context is closed after use.
s = request.session()
p = s.post(url, data=payload)
# print the html returned or something more intelligent to see if it's a successful login page.
print p, p.url, p.status_code
print 'is redirected: ', p.is_redirect
r = s.get("https://software.broadinstitute.org/gsea/msigdb/download_file.jsp?filePath=/resources/msigdb/6.2/msigdb_v6.2.xml")
# print r.text
print r, r.url, r.status_code
print 'is redirected: ', r.is_redirect
with open("lol.xml", "wb") as handle:
    handle.write(r.content)
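I'm not sure whether I have to send the password at all, since the password field is hidden?
The POST returns 200 OK, but I'm still not logged in: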
<Response [200]> http://software.broadinstitute.org/gsea/login.jsp 200
is redirected: False
<Response [200]> https://software.broadinstitute.org/gsea/login.jsp 200
is redirected: False
OK, one possible source of the error is that the payload uses the wrong dictionary keys.
The HTML of the form looks like this:
<form id="loginForm" name="loginForm" action="j_spring_security_check" method="POST">
<table border="0" class="bodyfont" cellpadding="5" cellspacing="5">
<tbody><tr>
<td colspan="2" align="left">Items marked with <font color="red">*</font> are required.</td>
</tr>
<tr>
<td colspan="2"> </td>
</tr>
<tr>
<td><h3>Email: <font color="red">*</font> </h3></td>
<td><input id="email" type="text" name="j_username" value="">
<input id="password" type="hidden" name="j_password" value="password"></td>
</tr>
<tr>
<td> </td>
<td><input type="button" name="login" value="login" style="margin-top:10px;" onclick="validateForm()"></td>
</tr>
</tbody></table>
</form>
Am I missing something? Why doesn't the login work?
Answer 0 (score: 1)
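As I mentioned in the comments, when you want to log in somewhere with requests, looking at the log in Chrome's Network tab is a very good first step. Your code doesn't work because you are simply POSTing to the wrong URL: the form submits to j_spring_security_check, not to login.jsp. There are also some typos in your code, e.g. request.session() instead of requests.session().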
import requests
login_url = "http://software.broadinstitute.org/gsea/j_spring_security_check"
url = "http://software.broadinstitute.org/gsea/index.jsp"
payload = {
    'j_username': 'a4702585@nwytg.net',
    'j_password': 'password'
}
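with requests.Session() as session:
    # POST the credentials to the form's action URL, not to login.jsp.
    login = session.post(login_url, data=payload)
    # The same session reuses any cookies set by the login response.
    req = session.get(url)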
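The login URL used above follows from the form in the question: its action attribute is the relative value j_spring_security_check, which a browser resolves against the URL of the page that served the form. A minimal sketch of that resolution with the standard library (the variable names are only illustrative):

```python
from urllib.parse import urljoin

# URL of the page that serves the login form (from the question).
login_page = "http://software.broadinstitute.org/gsea/login.jsp"
# Value of the form's action attribute in the question's HTML.
form_action = "j_spring_security_check"

# A relative action replaces the last path segment of the page URL.
login_url = urljoin(login_page, form_action)
print(login_url)
```

This only shows how the POST target is derived; the Network tab remains the quickest way to confirm which URL and fields the browser actually sends.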
More generally, I doubt that looking at the response status code is a good way to tell whether the login worked.
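The output in the question illustrates why: both requests return 200, yet the second one ended up back at login.jsp. A rough heuristic, assuming the site keeps sending unauthenticated users to login.jsp and keeps the form id loginForm shown in the question (the helper name looks_logged_in is hypothetical):

```python
from urllib.parse import urlparse


def looks_logged_in(final_url, html):
    """Heuristic check: return False if the response is the login page."""
    # Unauthenticated requests in the question ended up at .../login.jsp.
    if urlparse(final_url).path.endswith("/login.jsp"):
        return False
    # The login form in the question carries id="loginForm".
    if 'id="loginForm"' in html:
        return False
    return True
```

With the code above it could be called as looks_logged_in(req.url, req.text). It is a sketch based only on what the question shows, so it may need adjusting if the site changes its markup.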
You can, of course, replace that URL with whichever page of the site you actually want...
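For the file download from the question, the same pattern might look like the sketch below. The helper name download_with_session and the output path are hypothetical, and session is assumed to behave like a requests.Session; whether the site serves the file this way after login is not verified here.

```python
def download_with_session(session, login_url, payload, file_url, out_path):
    """Log in, fetch file_url with the same session, and save the body.

    Returns the final URL of the download request so the caller can
    confirm that it is not the login page.
    """
    session.post(login_url, data=payload)
    response = session.get(file_url)
    response.raise_for_status()  # raise on 4xx/5xx responses
    with open(out_path, "wb") as handle:
        handle.write(response.content)
    return response.url


# Example call (performs real network requests, so it is left commented out):
# import requests
# with requests.Session() as session:
#     final_url = download_with_session(
#         session,
#         "http://software.broadinstitute.org/gsea/j_spring_security_check",
#         {"j_username": "xxx@gmail.com", "j_password": "password"},
#         "https://software.broadinstitute.org/gsea/msigdb/download_file.jsp"
#         "?filePath=/resources/msigdb/6.2/msigdb_v6.2.xml",
#         "msigdb_v6.2.xml",
#     )
```

If the returned URL ends in login.jsp, the saved file is the login page rather than the XML, which is what happened to lol.xml in the question.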