I am trying to log in to a web page using the Python requests module, but the POST data of the site I want to log in to includes a uuid token:
pass: ********
user: ********
uuid: ********
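I have searched the requests documentation thoroughly for any mention of this and found nothing. Is this simply something the page itself generates, or is it something I have overlooked?
Here is the code I am using: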
import requests

url = 'http://www.website.com'  # requests needs an explicit scheme (http:// or https://)
with requests.Session() as c:
    c.get(url)
    values = {'pass': 'password', 'user': 'username'}
    response = c.post(url, data=values)
    print(response)
Answer 0 (score: 2)
You can parse it from the page source:
In [29]: from bs4 import BeautifulSoup
In [30]: import re
In [31]: patt = re.compile("document.cplogin.uuid.value=\"(.*?)\"")
In [32]: with requests.Session() as s:
   ....:     page = s.get('http://myneu.neu.edu/cp/home/displaylogin').content
   ....:     soup = BeautifulSoup(page, "html.parser")
   ....:     script = soup.find("script", language="javascript1.1")
   ....:     uuid = patt.search(script.text).group(1)
   ....:
In [33]: uuid
Out[33]: u'ff3e7ddd-0823-4f44-a003-0e68a9321e08'
If you look at the source of the login page, you can see the uuid inside the script tag that has the language="javascript1.1" attribute:
function login()
{
    setQueryAsCookie();
    document.cplogin.user.value=document.userid.user.value;
    document.cplogin.uuid.value="21fbc26a-3a3d-4802-ba4a-39a40aad881c";
    document.cplogin.submit();
}
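The extraction step can be tried offline before touching the network. The following is a minimal sketch that runs the same kind of regular expression against the login() text shown above; the names UUID_PATTERN, SAMPLE_SCRIPT and extract_uuid are made up for illustration, and the dots in the pattern are escaped so that they match literally.

```python
import re

# Dots are escaped so they match literally; the capture group holds the token.
UUID_PATTERN = re.compile(r'document\.cplogin\.uuid\.value="(.*?)"')

# Sample script text copied from the login() function shown above.
SAMPLE_SCRIPT = '''
function login()
{
    setQueryAsCookie();
    document.cplogin.user.value=document.userid.user.value;
    document.cplogin.uuid.value="21fbc26a-3a3d-4802-ba4a-39a40aad881c";
    document.cplogin.submit();
}
'''


def extract_uuid(script_text):
    """Return the uuid embedded in the script text, or None if it is absent."""
    match = UUID_PATTERN.search(script_text)
    return match.group(1) if match else None


print(extract_uuid(SAMPLE_SCRIPT))
```

Returning None instead of raising lets the caller decide what to do if the page layout changes and the assignment is no longer there.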
So when you post, pass it along with the rest of the form data.
The POST URL also appears to be https://myneu.neu.edu/cp/home/login, so:
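import re

import requests
from bs4 import BeautifulSoup

# The uuid is assigned inside the login() function of an inline script.
patt = re.compile(r'document\.cplogin\.uuid\.value="(.*?)"')
data = {"user": "uname", "pass": "passw"}
post = "https://myneu.neu.edu/cp/home/login"

with requests.Session() as s:
    # Fetch the login page in the same session so its cookies carry over to the POST.
    page = s.get('http://myneu.neu.edu/cp/home/displaylogin')
    soup = BeautifulSoup(page.content, "html.parser")
    script = soup.find("script", language="javascript1.1")
    # Pull the uuid out of the script text and add it to the form data.
    uuid = patt.search(script.text).group(1)
    data["uuid"] = uuid
    resp = s.post(post, data=data)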
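If the page changes, the regex could capture something that is not a uuid at all, and the login would then fail without an obvious cause. As a rough sketch, the token can be checked with the standard uuid module before the form data is built. The helper name build_login_data is hypothetical, the field names are the ones from the question, and note that uuid.UUID is fairly lenient (it also accepts the hex digits without hyphens).

```python
import uuid


def build_login_data(user, password, token):
    """Return the form data for the login POST, checking the token first."""
    try:
        # uuid.UUID raises if the string is not a well-formed uuid.
        uuid.UUID(token)
    except (ValueError, AttributeError, TypeError):
        raise ValueError("token does not look like a uuid: %r" % (token,))
    return {"user": user, "pass": password, "uuid": token}


data = build_login_data("uname", "passw", "ff3e7ddd-0823-4f44-a003-0e68a9321e08")
print(data["uuid"])
```

The resulting dictionary can be passed as data= to s.post in the same way as above.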