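I have some Python code that uses the requests module to access pages on a particular website. I am trying to redo this using the standard library, specifically urllib2.
This is the WORKING code using requests: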
import requests
#Now create a session
s = requests.session()
#look at login screen at http://supercoach.heraldsun.com.au/
#looking closely at what gets 'POSTed' when you submit the form
#it is sending the following params:
data = {
'username' :'test_account@scscorecollector.info',
'password' :'magicword',
'remember_me' :'on',
'channel' :'pc',
'site' :'HeraldSun',
'cancelUrl' :'http://supercoach.heraldsun.com.au/?identity-login-error=1',
'relayState' :'http://supercoach.heraldsun.com.au/?login=1',
'location' :'http://saml.cam.idmndm.com',
}
#This data gets POSTed to https://idp.news.com.au/idp/Authn/rest, but we want to read a different url
url_post = 'https://idp.news.com.au/idp/Authn/rest/'
url_read ='http://supercoach.heraldsun.com.au/team/other_teams?tid=11088'
r = s.post(url_post, data=data)
#Now that you are logged in, you can call the URL we want to read:
r = s.get(url_read)
print r
#<Response [200]>
Now I am struggling to work out how to replicate this using urllib2. Below is my NOT WORKING attempt.
import urllib2
import urllib
import cookielib
data = {'username' :'test_account@scscorecollector.info',
'password' :'magicword',
'remember_me' :'on',
'channel' :'pc',
'site' :'HeraldSun',
'cancelUrl' :'http://supercoach.heraldsun.com.au/?identity-login-error=1',
'relayState' :'http://supercoach.heraldsun.com.au/?login=1',
'location' :'http://saml.cam.idmndm.com'}
#some data
url_post = 'https://idp.news.com.au/idp/Authn/rest'
url_read ='http://supercoach.heraldsun.com.au/team/other_teams?tid=11088'
user_agent = 'Mozilla/5.0 (Windows NT 5.1; rv:31.0) Gecko/20100101 Firefox/31.0'
headers = { 'User-Agent' : user_agent }
data_encode = urllib.urlencode(data)
# setup cookie handler and create opener
cookie_jar = cookielib.LWPCookieJar()
cookie = urllib2.HTTPCookieProcessor(cookie_jar)
opener = urllib2.build_opener(cookie)
# post to
req = urllib2.Request(url_post, data_encode, headers)
req.get_method = lambda: 'POST'
response = opener.open(req)
It returns the following error:
urllib2.HTTPError: HTTP Error 302: The HTTP server returned a redirect error that would lead to an infinite loop.
The last 30x error message was:
Found
Is this because I am posting to a different URL from the one I want to read? Can anyone offer a solution to this problem?
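One way to narrow this down might be to log every redirect hop and see which URLs the server keeps bouncing between. The following is only a diagnostic sketch, not a fix: `LoggingRedirectHandler`, `build_logging_opener` and `trace_redirects` are hypothetical helper names, and the try/except import is an assumption added so the same code also runs on Python 3.

```python
try:  # Python 2, as used in the attempt above
    import urllib2 as request_mod
    import cookielib as cookiejar_mod
except ImportError:  # Python 3 equivalents (assumption, not in the original code)
    import urllib.request as request_mod
    import http.cookiejar as cookiejar_mod


class LoggingRedirectHandler(request_mod.HTTPRedirectHandler):
    """Hypothetical helper: record each redirect, then defer to the default handler."""

    def __init__(self):
        self.hops = []  # list of (status code, redirected-from URL, redirected-to URL)

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        self.hops.append((code, req.get_full_url(), newurl))
        return request_mod.HTTPRedirectHandler.redirect_request(
            self, req, fp, code, msg, headers, newurl)


def build_logging_opener():
    """Build an opener that keeps cookies and records redirects."""
    cookie_jar = cookiejar_mod.LWPCookieJar()
    redirect_logger = LoggingRedirectHandler()
    # Passing an HTTPRedirectHandler subclass instance replaces the default one.
    opener = request_mod.build_opener(
        request_mod.HTTPCookieProcessor(cookie_jar), redirect_logger)
    return opener, cookie_jar, redirect_logger


def trace_redirects(opener, redirect_logger, req):
    """Open req and return the recorded hops, even if an HTTPError is raised."""
    try:
        opener.open(req)
    except request_mod.HTTPError:
        pass  # the hops recorded so far are what we want to inspect
    return redirect_logger.hops
```

Calling `trace_redirects(opener, redirect_logger, req)` with the `req` built above and printing the returned hops, together with the contents of `cookie_jar`, should show whether the same URL is being revisited and whether any cookies were stored along the way.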
Thanks,
John