How do I log in to a website using Python requests?

Date: 2018-07-26 20:18:51

Tags: python post login python-requests

Registration on this site is free: http://software.broadinstitute.org/gsea/login.jsp

Following some tutorials, I wrote this code to log in to the site:

import requests

url = "http://software.broadinstitute.org/gsea/login.jsp"

# Fill in your details here to be posted to the login form.
payload = {
    'j_username': 'xxx@gmail.com',
    'j_password': 'password'
}

# Use 'with' to ensure the session context is closed after use.
s = request.session()
p = s.post(url, data=payload)
# print the html returned or something more intelligent to see if it's a successful login page.
print p, p.url, p.status_code
print 'is redirected: ', p.is_redirect

r = s.get("https://software.broadinstitute.org/gsea/msigdb/download_file.jsp?filePath=/resources/msigdb/6.2/msigdb_v6.2.xml")
# print r.text
print r, r.url, r.status_code
print 'is redirected: ', r.is_redirect

with open("lol.xml", "wb") as handle:
    handle.write(r.content)

I'm not sure whether I have to supply the password at all, since the password field is hidden.

The POST request returns 200 OK, but I am still not logged in:

<Response [200]> http://software.broadinstitute.org/gsea/login.jsp 200
is redirected:  False
<Response [200]> https://software.broadinstitute.org/gsea/login.jsp 200
is redirected:  False

One possible source of the error is a payload with the wrong dictionary keys.

The HTML of the login form is as follows:

<form id="loginForm" name="loginForm" action="j_spring_security_check" method="POST">
          <table border="0" class="bodyfont" cellpadding="5" cellspacing="5">
            <tbody><tr>
              <td colspan="2" align="left">Items marked with <font color="red">*</font> are required.</td>
            </tr>
            <tr>
              <td colspan="2">&nbsp;</td>
            </tr>
            <tr>
              <td><h3>Email:&nbsp;<font color="red">*</font>&nbsp;</h3></td>
              <td><input id="email" type="text" name="j_username" value="">
              <input id="password" type="hidden" name="j_password" value="password"></td>
            </tr>
            <tr>
          <td>&nbsp;</td>
          <td><input type="button" name="login" value="login" style="margin-top:10px;" onclick="validateForm()"></td>
        </tr>
      </tbody></table>
    </form>

Am I missing something? Why doesn't the login work?

1 Answer:

Answer 0: (score: 1)

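As I mentioned in the comments, when you want to log in somewhere with requests, looking at the log in Chrome's Network tab is a very good first step. Your code doesn't work because you are sending the POST request to the wrong URL: the form submits to j_spring_security_check, not to login.jsp. There are also some typos in your code, for example request.session() instead of requests.session().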

import requests


login_url = "http://software.broadinstitute.org/gsea/j_spring_security_check"
url = "http://software.broadinstitute.org/gsea/index.jsp"
payload = {
    'j_username': 'a4702585@nwytg.net',
    'j_password': 'password'
}

with requests.Session() as session:
    login = session.post(login_url, data=payload)
    req = session.get(url)
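
For reference, the login URL does not have to be read from the Network tab. The following is a minimal sketch (Python 3, standard library only) that recovers it from the form in the question by resolving the relative `action` attribute against the page URL. The `FormParser` class is a hypothetical helper written for this illustration, and `FORM_HTML` is a trimmed copy of the form shown above. It also suggests an answer to the hidden-field question: a hidden input is part of the form data like any other named field, so `j_password` belongs in the payload.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Trimmed copy of the login form from the question.
FORM_HTML = """
<form id="loginForm" name="loginForm" action="j_spring_security_check" method="POST">
  <input id="email" type="text" name="j_username" value="">
  <input id="password" type="hidden" name="j_password" value="password">
  <input type="button" name="login" value="login" onclick="validateForm()">
</form>
"""


class FormParser(HTMLParser):
    """Hypothetical helper: collects the form action and the named input fields."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and self.action is None:
            self.action = attrs.get("action", "")
        elif tag == "input" and attrs.get("type") != "button" and "name" in attrs:
            # Hidden inputs are kept: they are submitted like any other field.
            # Plain buttons are skipped because they carry no form data.
            self.fields[attrs["name"]] = attrs.get("value", "")


page_url = "http://software.broadinstitute.org/gsea/login.jsp"
parser = FormParser()
parser.feed(FORM_HTML)

# A relative action is resolved against the URL of the page containing the form.
login_url = urljoin(page_url, parser.action)
print(login_url)
print(parser.fields)
```

The resolved `login_url` matches the one hard-coded in the answer's code, and `parser.fields` lists the keys the payload needs.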

In general, I also doubt that looking at the response status code is a good way to determine whether the login worked.
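
The output in the question illustrates this: both requests returned 200, yet the second one ended up on login.jsp. One possible alternative, sketched below under the assumption that a failed login always lands back on the login page, is to inspect the final URL and the returned HTML. The function name `looks_logged_in` is made up for this example, and the `id="loginForm"` marker is taken from the form in the question.

```python
from urllib.parse import urlparse


def looks_logged_in(final_url, html):
    """Heuristic: if the response is still the login page, the login failed."""
    # Assumption: unauthenticated requests are sent back to login.jsp.
    on_login_page = urlparse(final_url).path.endswith("login.jsp")
    # Assumption: only the login page contains the form with id "loginForm".
    has_login_form = 'id="loginForm"' in html
    return not (on_login_page or has_login_form)
```

With requests, this could be called as `looks_logged_in(req.url, req.text)` on the response of the GET request.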

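  1. Open a session
  2. Send a POST request containing the payload to the correct URL
  3. Send a GET request to emulate the redirect that happens automatically in the browser (again, something you can easily learn from the Chrome Network tab)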

You can of course replace that URL with whichever page of the site you actually want...
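
For the file download from the question, that could look roughly like the sketch below. `login_and_download` is a hypothetical helper, not part of requests. It takes the session as a parameter, so any object with `post` and `get` methods that return a response with `url` and `content` attributes will work, such as a `requests.Session`. The login.jsp check assumes the behaviour shown in the question's output, where an unauthenticated download request ended up on the login page.

```python
def login_and_download(session, login_url, payload, file_url, out_path):
    """Log in with `session`, then save the body of `file_url` to `out_path`."""
    # Step 2 of the list above: post the credentials to the form's target URL.
    session.post(login_url, data=payload)
    # Step 3: request the page or file that actually requires the login.
    response = session.get(file_url)
    # Assumption: an unauthenticated request is redirected to login.jsp,
    # as seen in the output posted in the question.
    if "login.jsp" in response.url:
        raise RuntimeError("not logged in, got " + response.url)
    with open(out_path, "wb") as handle:
        handle.write(response.content)
    return out_path
```

A call would pass `requests.Session()` as `session`, the `login_url` and `payload` from the code above, and the msigdb_v6.2.xml download URL from the question as `file_url`.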