Below is my code for logging in to the page and fetching its source.
import requests
import sys
import urllib, urllib2, cookielib

USERNAME = ''
PASSWORD = ''
URL = 'http://coned.com'

def main():
    # Start a session so we can have persistent cookies
    session = requests.session()
    cj = cookielib.CookieJar()
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
    # This is the form data that the page sends when logging in
    login_data = {
        'TxtUser': USERNAME,
        'TxtPwd': PASSWORD,
        'submit': 'Sign In',
    }
    # Authenticate
    r = session.post(URL, data=login_data)
    # Try accessing a page that requires you to be logged in
    r = session.get('https://apps1.coned.com/cemyaccount/MemberPages/MyAccounts.aspx?lang=eng')
    resp = opener.open('https://apps1.coned.com/cemyaccount/MemberPages/MyAccounts.aspx?lang=eng')
    print resp
    print r.text

if __name__ == '__main__':
    main()
Here r.text does not work; I need the page's HTML after logging in. Can anyone help me with what to do here?
Answer 0 (score: 0)
Opening http://coned.com in Chrome with the Developer Tools pane open, I was able to trace my login attempt, shown below. I used testtesttest as the username and test as the password.
Headers:
Request URL: https://apps2.coned.com/cemyaccount/NonMemberPages/Login.aspx?lang=eng
Request Method: POST
Status Code: 200 OK
Form data:
TxtUser:testtesttest
UserName:VALUE
UserName:0
TxtPwd:test
UserName2:VALUE
UserName2:0
ctl00$Main$Login1$LoginButton:Sign In
Knowing this, you should build your form data with the additional parameters:
URL = 'https://apps2.coned.com/cemyaccount/NonMemberPages/Login.aspx?lang=eng'

# This is the form data that the page sends when logging in.
# UserName and UserName2 each appear twice in the trace. A dict would keep
# only the last value per key, so a list of (name, value) pairs is used.
login_data = [
    ('TxtUser', USERNAME),
    ('UserName', 'VALUE'),
    ('UserName', '0'),
    ('TxtPwd', PASSWORD),
    ('UserName2', 'VALUE'),
    ('UserName2', '0'),
    ('ctl00$Main$Login1$LoginButton', 'Sign In'),
]

# Authenticate against the login page
r = session.post(URL, data=login_data)

# r.text now contains the HTML of the response to the request you just made,
# in this case the login page's redirected response.
# Check whether the word "success" appears anywhere in the result...
print([line for line in r.text.splitlines() if 'success' in line.lower()])
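The traced form data repeats UserName and UserName2. A Python dict keeps only the last value for a repeated key, whereas a list of (name, value) pairs keeps every entry, and requests accepts such a list for data. Here is a minimal sketch of the difference, written for Python 3 with only the standard library. The values are the placeholders from the trace, and whether the server actually needs both copies of each field is an assumption I have not verified.

```python
from urllib.parse import urlencode

# Placeholder values copied from the DevTools trace; not real credentials.
pairs = [
    ('TxtUser', 'testtesttest'),
    ('UserName', 'VALUE'),
    ('UserName', '0'),
    ('TxtPwd', 'test'),
    ('UserName2', 'VALUE'),
    ('UserName2', '0'),
    ('ctl00$Main$Login1$LoginButton', 'Sign In'),
]

# Converting to a dict collapses repeated keys; the last value wins.
as_dict = dict(pairs)

# urlencode shows what would go into the request body in each case.
encoded_pairs = urlencode(pairs)
encoded_dict = urlencode(as_dict)

print(encoded_pairs)
print(encoded_dict)
```

Comparing the two printed strings shows that the dict version sends UserName and UserName2 only once each.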
The site appears to send you back to the login page when your login is invalid, with one extra piece of HTML:
<span id="ctl00_Main_FailureMsg">Your sign In attempt was not successful. Please try again. If you have not created your registry information you can register now.</span>
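Looking for that span is likely a more reliable check than searching for the word "success". A rough sketch in Python 3 follows; the helper name login_failure_message is made up for illustration, and it assumes the span is absent or empty when the login succeeds, which I have not confirmed against the site.

```python
import re

FAILURE_ID = 'ctl00_Main_FailureMsg'  # id taken from the span shown above

def login_failure_message(html):
    """Return the failure text if the failure span has content, else None.

    Hypothetical helper: assumes the span is missing or empty on success.
    """
    pattern = r'<span[^>]*\bid="' + re.escape(FAILURE_ID) + r'"[^>]*>(.*?)</span>'
    match = re.search(pattern, html, re.DOTALL | re.IGNORECASE)
    if not match:
        return None
    text = match.group(1).strip()
    return text or None

# Offline samples so the sketch runs without contacting the site.
failed_page = '<span id="ctl00_Main_FailureMsg">Your sign In attempt was not successful.</span>'
other_page = '<html><body>Welcome</body></html>'

print(login_failure_message(failed_page))
print(login_failure_message(other_page))
```

After the POST, you would call login_failure_message(r.text) and treat a non-None result as a failed login.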
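Finally, you should also consider mechanize or scrapy. Both tools are well documented and are designed for exactly the kind of task you are after.
Hope that points you in the right direction.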