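I'd like to apologize up front. I know this has probably been asked many times and I'm just beating a dead horse, but I'd really like to know how to get this working. I'm trying to use Python's Requests module to log in to a website and verify that the login worked. I'm also using BeautifulSoup in my code to find some strings that I need in order to build the request.
I'm confused about how to form the headers properly. What exactly needs to go in the header information?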
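On the header question, one way to see what Requests would send by default is to prepare a request without sending it and print its headers. This is only a minimal sketch: the two-field payload is a shortened placeholder, and it says nothing about which headers this particular site actually checks.

```python
import requests

# Hypothetical placeholder values, not real credentials or the full form.
payload = [("username", "myUsername"), ("password", "myPassword")]

session = requests.Session()

# Build the request without sending it, so nothing touches the network.
request = requests.Request(
    "POST", "http://iweb.craven.k12.nc.us/index.php", data=payload
)
prepared = session.prepare_request(request)

# Headers Requests fills in by itself: session defaults plus form encoding.
for name, value in prepared.headers.items():
    print(f"{name}: {value}")

# The form-encoded body that would be posted.
print(prepared.body)
```

If the output already shows Content-Type, Content-Length, Accept and Connection, setting them by hand is probably redundant. A custom User-Agent or Origin may only matter if the server filters on them, which this sketch cannot tell.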
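import requests
from bs4 import BeautifulSoup

# Use one session so cookies from the first GET carry over to the login POST.
session = requests.Session()
requester = session.get('http://iweb.craven.k12.nc.us/index.php')
# Name the parser explicitly so bs4 does not have to guess one.
soup = BeautifulSoup(requester.text, 'html.parser')
ps = soup.find_all('input')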
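def getCookieInfo():
    # Collect the hidden 'return' value and the name of the field that follows it.
    result = []
    for item in ps:
        # .get() avoids a KeyError on inputs that lack a name or type attribute.
        if item.get('name') == 'return' and item.get('type') == 'hidden':
            strcom = item.attrs['value']
            # Assumes a whitespace text node sits between the two <input> tags.
            sibling = item.next_sibling.next_sibling.attrs['name']
            result.append(strcom)
            result.append(sibling)
    return result

cookiedInfo = getCookieInfo()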
payload = [('username', 'myUsername'),
           ('password', 'myPassword'),
           ('Submit', 'Log in'),
           ('option', 'com_users'),
           ('task', 'user.login'),
           ('return', cookiedInfo[0]),
           (cookiedInfo[1], '1')
           ]
headers = {
    'Connection': 'keep-alive',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Origin': 'http://iweb.craven.k12.nc.us',
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64)'
}
# Submit the login form, then re-fetch the home page with the same session.
r = session.post('http://iweb.craven.k12.nc.us/index.php', data=payload, headers=headers)
r = session.get('http://iweb.craven.k12.nc.us')
soup = BeautifulSoup(r.text, 'html.parser')
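For the "verify that it worked" part, a rough sketch of one possible check follows. It runs against made-up sample markup instead of the real site, and both helper names (`hidden_fields`, `looks_logged_in`) are hypothetical. The heuristic that a logged-in page no longer shows a password box is an assumption and may not hold for this site.

```python
from bs4 import BeautifulSoup


def hidden_fields(html):
    """Return {name: value} for every hidden <input> that has a name."""
    soup = BeautifulSoup(html, "html.parser")
    fields = {}
    for tag in soup.find_all("input", attrs={"type": "hidden"}):
        name = tag.get("name")
        if name:
            fields[name] = tag.get("value", "")
    return fields


def looks_logged_in(html):
    """Guess the login state: assume no password box means logged in."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.find("input", attrs={"name": "password"}) is None


# Made-up sample markup standing in for the real login page.
sample_login_page = """
<form>
  <input type="text" name="username">
  <input type="password" name="password">
  <input type="hidden" name="return" value="aW5kZXgucGhw">
  <input type="hidden" name="abc123token" value="1">
</form>
"""

print(hidden_fields(sample_login_page))
print(looks_logged_in(sample_login_page))
```

Against the real site, `hidden_fields(requester.text)` could stand in for the sibling-walking lookup, and `looks_logged_in(r.text)` could be called on the page fetched after the POST. Checking for text that only appears when logged in, such as a logout link, might be a more reliable signal if the page has one.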
If using the mechanize module would be better or more Pythonic, I'm open to suggestions.