Python web scraping with login

Time: 2017-05-25 03:33:18

Tags: python html web web-scraping request

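I am trying to log in to a password-protected website so that I can access a protected page. I have the email and password field names as well as the csrf-token. But when I try to access the protected page, the site does not let me in and redirects me back to the login page. Any help would be great! The website I am trying to access is: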

https://www.usertesting.com/users/sign_in

import requests
from lxml import html

session_requests = requests.session()

login_url = "https://www.usertesting.com/users/sign_in"
result = session_requests.get(login_url)

tree = html.fromstring(result.text)
authenticity_token = list(set(tree.xpath("//meta[@name='csrf-token']/@content")))[0]

userInfo = {
    "user[email]": "email", 
    "user[password]": "password", 
    "csrf-token": authenticity_token
}

result = session_requests.post(
    login_url, 
    data = userInfo, 
    headers = dict(referer=login_url)
)

url = 'https://www.usertesting.com/my_dashboard'

result = session_requests.get(
    url, 
    headers = dict(referer = url)
)

print(result.content)

1 Answer:

Answer 0: (score: 0)

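Try looking at https://kazuar.github.io/scraping-tutorial/ for the answer you are looking for. In short, you need to inspect the login page first. Before starting the full scraping program, write a separate function that submits the username and password and logs in to the site. Once that function works, run the full script.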
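The login step described above could be sketched roughly as follows. This is a minimal illustration and does not reproduce the linked tutorial's code. The names `HiddenInputParser`, `build_login_payload`, and `log_in` are hypothetical. The field names `user[email]` and `user[password]` are taken from the question. The sketch assumes the login form carries its CSRF token in a hidden `<input>` (Rails-style forms commonly name it `authenticity_token`), so it copies every hidden field into the POST data instead of guessing the name. It also assumes that ending up on the login URL again means the login failed, which may not hold for every site.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collects name/value pairs of <input type="hidden"> fields."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attrs = dict(attrs)
        if attrs.get("type") == "hidden" and attrs.get("name"):
            self.fields[attrs["name"]] = attrs.get("value") or ""


def build_login_payload(login_page_html, email, password):
    """Build the POST data: all hidden form fields plus the credentials."""
    parser = HiddenInputParser()
    parser.feed(login_page_html)
    payload = dict(parser.fields)  # e.g. the CSRF token, if the form has one
    payload["user[email]"] = email        # field name taken from the question
    payload["user[password]"] = password  # field name taken from the question
    return payload


def log_in(session, login_url, email, password):
    """Fetch the login page, submit the form, and report whether it worked.

    `session` is expected to behave like requests.Session (get/post methods,
    responses with .text and .url).
    """
    page = session.get(login_url)
    payload = build_login_payload(page.text, email, password)
    response = session.post(
        login_url, data=payload, headers={"Referer": login_url}
    )
    # Assumption: a failed login leaves us on (or sends us back to) login_url.
    logged_in = response.url.rstrip("/") != login_url.rstrip("/")
    return logged_in, response
```

With Requests installed, calling `log_in(requests.session(), login_url, email, password)` and then reusing the same session object for later `session.get(...)` calls should carry the login cookies over to the protected pages. Checking the returned flag before scraping makes a failed login visible immediately, before any redirect back to the sign-in page.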