The requests module does not get past the login

Time: 2018-12-12 14:34:25

Tags: python python-3.x input python-requests

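I am trying to use the requests module to fetch information from a website. To get the information you have to log in first, and only then can you access the page. I looked at the input tags and found that they are named login_username and login_password, but for some reason the POST does not go through. I also read elsewhere that someone solved this by waiting a few seconds before navigating to another page, but that did not help either.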

Here is my code:

import requests
import time

#This URL will be the URL that your login form points to with the "action" tag.
loginurl = 'https://jadepanel.nephrite.ro/login'

#This URL is the page you actually want to pull down with requests.
requesturl = 'https://jadepanel.nephrite.ro/clan/view/123'

payload = {
    'login_username': 'username',
    'login_password': 'password'
}

with requests.Session() as session:
    post = session.post(loginurl, data=payload)
    time.sleep(3)
    r = session.get(requesturl)
    print(r.text)

1 answer:

Answer 0 (score: 2)

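login_username and login_password are not the only required parameters. If you look at the /login/ POST request in your browser's developer tools, you will see that a _token is also being sent.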

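This token is something you need to parse out of the login page HTML. So the flow is:

  • GET https://jadepanel.nephrite.ro/login
  • Parse the HTML and extract the _token
  • Make a POST request with the login name, password and token
  • Browse the site using the logged-in session

For HTML parsing we can use BeautifulSoup (there are other options, of course):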

from bs4 import BeautifulSoup

login_html = session.get(loginurl).text
soup = BeautifulSoup(login_html, "html.parser")

token = soup.find("input", {"name": "_token"})["value"]

payload = {
    'login_username': 'username',
    'login_password': 'password',
    '_token': token
}
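
As one of those other options, here is a minimal sketch that pulls the token out with only the standard library's html.parser, in case installing BeautifulSoup is not convenient. The names TokenParser, extract_token and sample_html are hypothetical, and the sample form is an assumption about what the login page roughly looks like, not a copy of the real one.

```python
from html.parser import HTMLParser


class TokenParser(HTMLParser):
    """Remember the value of the first <input> whose name matches field_name."""

    def __init__(self, field_name="_token"):
        super().__init__()
        self.field_name = field_name
        self.token = None

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs
        attributes = dict(attrs)
        if (
            tag == "input"
            and attributes.get("name") == self.field_name
            and self.token is None
        ):
            self.token = attributes.get("value")


def extract_token(html, field_name="_token"):
    """Return the token value, or None if the field is not in the HTML."""
    parser = TokenParser(field_name)
    parser.feed(html)
    return parser.token


# Hypothetical login form, used only to demonstrate the parser.
sample_html = """
<form method="POST" action="/login">
  <input type="hidden" name="_token" value="abc123">
  <input type="text" name="login_username">
  <input type="password" name="login_password">
</form>
"""

print(extract_token(sample_html))  # abc123
```

With a real session you would pass session.get(loginurl).text to extract_token instead of sample_html. Returning None when the field is missing makes it easy to notice that the page layout is not what you expected before sending the POST.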

Complete code:

import time

import requests
from bs4 import BeautifulSoup


# This URL will be the URL that your login form points to with the "action" tag.
loginurl = 'https://jadepanel.nephrite.ro/login'

# This URL is the page you actually want to pull down with requests.
requesturl = 'https://jadepanel.nephrite.ro/clan/view/123'

with requests.Session() as session:
    login_html = session.get(loginurl).text
    soup = BeautifulSoup(login_html, "html.parser")

    token = soup.find("input", {"name": "_token"})["value"]

    payload = {
        'login_username': 'username',
        'login_password': 'password',
        '_token': token
    }

    post = session.post(loginurl, data=payload)
    time.sleep(3)
    r = session.get(requesturl)
    print(r.text)
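
If it is unclear whether the POST actually logged you in, one rough check is to see whether the page you got back still contains the login form. The sketch below is only a heuristic and assumes that the site shows the login_username field again after a failed login, which I have not verified. The names InputNameCollector and still_on_login_page are hypothetical.

```python
from html.parser import HTMLParser


class InputNameCollector(HTMLParser):
    """Collect the name attribute of every <input> element on a page."""

    def __init__(self):
        super().__init__()
        self.names = set()

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            name = dict(attrs).get("name")
            if name:
                self.names.add(name)


def still_on_login_page(html, field="login_username"):
    """Return True if the HTML still contains the login form field."""
    collector = InputNameCollector()
    collector.feed(html)
    return field in collector.names
```

In the complete code above you could call still_on_login_page(post.text) right after session.post(...) and stop early if it returns True, instead of requesting the clan page with a session that is not logged in.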