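The problem I'm facing, and trying to solve with Python, is making consecutive POST requests to a website (filling in an online form); specifically, the free online demo of the API at http://demo.travelportuniversalapi.com. So far I have been unable to get the results page, and I have been at it for two days now.
The code I am using is: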
import sys
import urllib, urllib2, cookielib
from BeautifulSoup import BeautifulSoup
import re

class website:
    def __init__(self):
        self.host = 'demo.travelportuniversalapi.com'
        self.ua = 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:23.0) Gecko/20100101 Firefox/23.0'
        self.session = cookielib.CookieJar()  # session becomes an instance of cookielib.CookieJar

    def get(self):
        try:
            # First request: plain GET of the search page
            url = 'http://demo.travelportuniversalapi.com/(S(cexfuhghvlzyzx5n0ysesra1))/Search'  # this varies every 20 minutes
            data = None
            headers = {'User-Agent': self.ua}
            request = urllib2.Request(url, data, headers)
            self.session.add_cookie_header(request)
            response = urllib2.urlopen(request)
            self.session.extract_cookies(response, request)
            url = response.geturl()

            # Second request: POST the form data to the URL we ended up on
            data = {'From': 'lhr', 'To': 'ams', 'Departure': '9/4/2013', 'Return': '9/6/2013'}
            headers = {'User-Agent': self.ua,
                       'Content-type': 'application/x-www-form-urlencoded; charset=UTF-8'}
            request = urllib2.Request(url, urllib.urlencode(data), headers)
            self.session.add_cookie_header(request)
            response = urllib2.urlopen(request, timeout=30)  # HTTP Error 404: Not Found - this is where I get the error
            self.session.extract_cookies(response, request)
        except urllib2.URLError as e:
            print >> sys.stderr, e
            return None

rt = website()
rt.get()
The error I get on the last urllib2 request is HTTP Error 404: Not Found. I'm not sure whether my cookies are working.
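One way to check whether cookies are being stored and resent is to let an opener manage the jar and then inspect it. The sketch below is a Python 3 equivalent (`http.cookiejar` and `urllib.request` in place of `cookielib` and `urllib2`); the helper names `make_session`, `describe_cookies` and `session_token` are hypothetical. It also extracts the `(S(...))` segment from a URL, on the assumption (not confirmed for this site) that the session id is carried in the path rather than in a cookie.

```python
import re
from http.cookiejar import CookieJar
from urllib.request import HTTPCookieProcessor, build_opener

USER_AGENT = ("Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:23.0) "
              "Gecko/20100101 Firefox/23.0")


def make_session():
    """Return (opener, jar); the opener stores and resends cookies itself."""
    jar = CookieJar()
    opener = build_opener(HTTPCookieProcessor(jar))
    opener.addheaders = [("User-Agent", USER_AGENT)]
    return opener, jar


def describe_cookies(jar):
    """List the cookies currently held, as 'name=value (domain)' strings."""
    return ["%s=%s (%s)" % (c.name, c.value, c.domain) for c in jar]


def session_token(url):
    """Return the '(S(...))' path segment of url, or None if it has none."""
    match = re.search(r"/(\(S\([^)]*\)\))/", url)
    return match.group(1) if match else None
```

After a call such as `opener.open(url)`, `describe_cookies(jar)` shows what the server actually set. If the list stays empty while `session_token(response.geturl())` returns a value, the session is probably tied to the URL, and the cookie handling may not be the cause of the 404.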
Monitoring the HTTP traffic with a browser plugin, I noticed that when the browser sends the POST it includes the following header: 'X-Requested-With: XMLHttpRequest'. Is this relevant?
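To test whether the header matters, the POST can be built with `X-Requested-With` set and inspected before it is sent. This is a Python 3 sketch (`urllib.request` rather than `urllib2`); `build_search_request` is a hypothetical helper, and whether the server actually requires the header is an assumption still to be verified.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_search_request(url, user_agent):
    """Build (but do not send) the form POST, marked as an AJAX request."""
    form = {"From": "lhr", "To": "ams",
            "Departure": "9/4/2013", "Return": "9/6/2013"}
    headers = {
        "User-Agent": user_agent,
        "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
        # The header seen in the browser's own POST
        "X-Requested-With": "XMLHttpRequest",
    }
    # Request bodies must be bytes in Python 3
    body = urlencode(form).encode("ascii")
    return Request(url, data=body, headers=headers)
```

Passing the result to `urllib.request.urlopen` (or to an opener that handles cookies) sends it. Comparing the responses with and without the extra header should show whether it is relevant.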