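Experts,
Retrieving even the most basic website with Google App Engine seems to be very challenging!
In my case, I want to retrieve the page at this URL:
http://tdbank.mortgagewebcenter.com/PowerSite/CheckRates.aspx/Index/9809
I want to accept all cookies and then POST a response to this URL (it is a simple form post):
http://tdbank.mortgagewebcenter.com/PowerSite/CheckRates.aspx/Search
The string I want to post is:
'POSTDATA': 'Q585=1&Q2926=1&Q586=200000&Q587=240000&Q588=&Q9166=07071&Q591=1&Q592=1&Q594=3&searchButton=Search'
The problem I am running into is that the returned page, when shown in my web browser, states "Cookies are not enabled".
As you can see from the code below, I tried adding the cookies manually, but that did not work.
Please help! -Todd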
import cgi
import webapp2
import gzip
import StringIO
from google.appengine.api import users
from google.appengine.api import urlfetch
from BeautifulSoup import BeautifulSoup

class MainPage(webapp2.RequestHandler):
    def get(self):
        # self.response.headers['Content-Type'] = 'text/html'
        url = "http://tdbank.mortgagewebcenter.com/PowerSite/CheckRates.aspx/Index/9809"
        result = urlfetch.fetch(url,
            headers={'User-Agent': 'Mozilla/5.0 (iPhone; CPU iPhone OS 5_1_1 like Mac OS X) AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9B206 Safari/7534.48.3',
                     'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
                     'Accept-Language': 'en-us',
                     'Accept-Encoding': 'gzip',
                     'Connection': 'keep-alive'})
        cookie = result.headers.get('set-cookie')
        input_text = {'User-Agent': 'Mozilla/5.0 (iPhone; CPU iPhone OS 5_1_1 like Mac OS X) AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9B206 Safari/7534.48.3',
                      'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
                      'Accept-Language': 'en-us',
                      'Accept-Encoding': 'gzip',
                      'Connection': 'keep-alive',
                      'Content-Type': 'application/x-www-form-urlencoded',
                      'Content-Length': '97',
                      'POSTDATA': 'Q585=1&Q2926=1&Q586=200000&Q587=240000&Q588=&Q9166=07071&Q591=1&Q592=1&Q594=3&searchButton=Search',
                      'SiteProfile':'ProfileId=9809',
                      's_sess':'c_m=undefinedfeedity.comfeedity.com; s_sq=; s_cc=true;;',
                      'bhCookieSaveSess':'1',
                      'bhPrevResults':'bhjs=1&bhrf=http://www.google.com/'}
        input_text['set-cookie'] = cookie
        ## self.response.out.write(input_text)
        url2 = "http://tdbank.mortgagewebcenter.com/PowerSite/CheckRates.aspx/Search"
        result2 = urlfetch.fetch(url2, method='POST',headers=input_text)
        # self.response.out.write(result.headers)
        f = StringIO.StringIO(result2.content)
        c = gzip.GzipFile(fileobj=f)
        content = c.read()
        self.response.out.write(content)
        # self.response.out.write(result.content)

app = webapp2.WSGIApplication([('/', MainPage)],
                              debug=True)
The yaml file:
application: fimrates
version: 1
runtime: python27
api_version: 1
threadsafe: true
handlers:
- url: /.*
  script: fimrates.app
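Answer 0 (score: 1)
Your code never actually sets the cookie: it sends the value back under a 'set-cookie' header, whereas a client returns cookies in the 'Cookie' request header. Here is a revised version of your code.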
# import this module to be able to create a cookie
import Cookie

class BankHandler(webapp2.RequestHandler):
    # add this function to create the cookie header
    def createCookieHeader(self, cookie):
        cookieHeader = ""
        for value in cookie.values():
            cookieHeader += "%s=%s; " % (value.key, value.value)
        return cookieHeader

    def get(self):
        ...
        self.cookie = Cookie.SimpleCookie()  # create the cookie
        result = urlfetch.fetch({...,
            # inject the cookie header
            'Cookie': self.createCookieHeader(self.cookie)})
        cookie = result.headers.get('set-cookie', '')
        input_text = {...}
        # the header is 'Cookie'
        input_text['Cookie'] = cookie
        url2 = "http://tdbank.mortgagewebcenter.com/PowerSite/CheckRates.aspx/Search"
        result2 = urlfetch.fetch(url2, method='POST',headers=input_text)
        ...
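
Two further details may matter here, and the sketch below illustrates both under stated assumptions. First, a raw Set-Cookie value usually carries attributes such as path or HttpOnly, which do not belong in a Cookie request header, so it can help to keep only the name=value pairs. Second, the form fields probably need to travel in the request body rather than in a 'POSTDATA' header. The helper names cookie_header_from_set_cookie and build_form_payload are hypothetical, the session cookie is made up, and the try/except import is only there so the snippet runs on both Python 2.7 and Python 3. Several cookies merged into one comma-separated header value may need extra handling that this sketch does not attempt.

```python
try:  # Python 3 module names
    from http.cookies import SimpleCookie
    from urllib.parse import urlencode
except ImportError:  # Python 2.7, as used in the code above
    from Cookie import SimpleCookie
    from urllib import urlencode


def cookie_header_from_set_cookie(set_cookie_value):
    """Hypothetical helper: keep only name=value pairs from a Set-Cookie value."""
    jar = SimpleCookie()
    jar.load(set_cookie_value or "")
    # Attributes such as path or HttpOnly are stored on each morsel and dropped here.
    return "; ".join("%s=%s" % (m.key, m.value) for m in jar.values())


# Form fields taken from the question, kept in their original order.
FORM_FIELDS = [
    ("Q585", "1"), ("Q2926", "1"), ("Q586", "200000"), ("Q587", "240000"),
    ("Q588", ""), ("Q9166", "07071"), ("Q591", "1"), ("Q592", "1"),
    ("Q594", "3"), ("searchButton", "Search"),
]


def build_form_payload(fields):
    """Hypothetical helper: URL-encode the fields for use as the POST body."""
    return urlencode(fields)


# Example with a made-up session cookie (not a value returned by the real site).
cookie_header = cookie_header_from_set_cookie("sid=abc123; path=/; HttpOnly")
payload = build_form_payload(FORM_FIELDS)

# Inside the handler these would presumably be passed along the lines of:
#   urlfetch.fetch(url2, payload=payload, method=urlfetch.POST,
#                  headers={'Cookie': cookie_header,
#                           'Content-Type': 'application/x-www-form-urlencoded'})
```

Whether the site then accepts the session is untested; the sketch only shows the shape of the request.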