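I am trying to scrape the current loan note status for each link in the URL column of the Lending Club download data, for example https://lendingclub.com/browse/loanDetail.action?loan_id=104046830. The page requires a login before the information can be extracted.
I have followed the usual steps to create a login session, but the login does not seem to succeed: the result does not contain the expected page source. Can anyone help me identify the problem?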
import requests
from lxml import html

USERNAME = "username"
PASSWORD = "password"
LOGIN_URL = "https://www.lendingclub.com/auth/login?"
loan_id = 96490539
URL = "https://lendingclub.com/browse/loanDetail.action?loan_id=96490539"

def main():
    session_requests = requests.session()
    # Get login csrf token
    result = session_requests.get(LOGIN_URL)
    tree = html.fromstring(result.text)
    authenticity_token = tree.xpath("//meta[@name='csrf-token']/@content")[0]
    # Create payload
    payload = {
        "login_email": USERNAME,
        "login_password": PASSWORD,
        "csrf-token": authenticity_token
    }
    # Perform login
    result = session_requests.post(LOGIN_URL, data=payload, headers=dict(referer=LOGIN_URL))
    # Scrape url
    result = session_requests.get(URL, headers=dict(referer=URL))
    return result
Answer 0 (score: 0)
What I am suggesting may look strange, but you can give it a try. According to Chrome dev tools, it should be enough to get a valid response.
import requests

USERNAME = "username"
PASSWORD = "password"
LOGIN_URL = "https://www.lendingclub.com/account/login.action"

def main():
    # Form fields as seen in the login request in Chrome dev tools
    payload = {
        'login_url': '/browse/loanDetail.action?loan_id=96490539',
        'login_email': USERNAME,
        'login_password': PASSWORD,
        'offeredNotListedPromotionFlag': ''
    }
    with requests.session() as session:
        # Send a browser-like User-Agent on every request in this session
        session.headers.update({'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36'})
        result = session.post(LOGIN_URL, data=payload,
                              headers={'Referer': 'https://www.lendingclub.com/browse/loanDetail.action?loan_id=96490539',
                                       'Content-Type': 'application/x-www-form-urlencoded'})
    return result
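To tell whether the POST actually logged you in, it can help to inspect the response that comes back instead of assuming success. The following is only a rough sketch: `looks_logged_in` and `LOGIN_PATH_HINTS` are hypothetical names, and the heuristics (the final URL does not point at a login path, and the body does not contain the `login_password` form field) are assumptions based on the URLs and field names in this thread, not documented site behaviour.

```python
from types import SimpleNamespace
from urllib.parse import urlparse

# Assumption: login pages live under one of these paths (taken from the
# two login URLs mentioned in this thread).
LOGIN_PATH_HINTS = ("/auth/login", "/account/login")


def looks_logged_in(response):
    """Heuristically decide whether `response` is a real page, not a login bounce.

    `response` can be any object with `status_code`, `url` and `text`
    attributes, such as a requests.Response.
    """
    if response.status_code != 200:
        return False
    # After redirects, `url` is the final address; a login path means we were bounced.
    path = urlparse(response.url).path
    if any(hint in path for hint in LOGIN_PATH_HINTS):
        return False
    # A password field in the body suggests the login form was served again.
    return "login_password" not in response.text


# Stand-in objects so the check can be tried without touching the network.
loan_page = SimpleNamespace(
    status_code=200,
    url="https://www.lendingclub.com/browse/loanDetail.action?loan_id=96490539",
    text="<html><body>Loan details</body></html>",
)
login_page = SimpleNamespace(
    status_code=200,
    url="https://www.lendingclub.com/auth/login?",
    text="<input type='password' name='login_password'>",
)
print(looks_logged_in(loan_page), looks_logged_in(login_page))
```

With a real session you would pass the `result` returned by `main()` to `looks_logged_in`; if it returns False, printing `result.url` and `result.history` shows where the redirects ended up.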