How can I send two consecutive requests, including a redirect?

Time: 2016-02-05 11:38:33

Tags: python web-crawler python-requests

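I am trying to use Python requests to mimic the search feature of a website as it works in the browser.

However, it is not as straightforward as other simple requests.

I opened developer mode in Chrome, copied the two requests as cURL, and then converted them into Python requests form.

From Python I only get a 500 error, but in the browser I get the correct response.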

Current code, which only returns a 500 error:

    import requests

    cookies = {
        'optimizelyEndUserId': 'oeu1454030467608r0.5841516454238445',
        ~~~
        '_gat': '1',
    }

    headers = {
        'Origin': 'https://m.flyscoot.com',
        ~~~~
    }

    data = 'origin=KHH&destination=KIX&departureDate=20160309&returnDate=&roundTrip=false&adults=1&children=0&infants=0&promoCode='
    req = requests.session()
    resp_1 = req.post('https://m.flyscoot.com/search', headers=headers, cookies=cookies, data=data)
    headers = {
        'Accept-Encoding': 'gzip, deflate, sdch',
        ~~~~
    }

    # The first request gets redirected, so I copied the cookies set by the
    # first response in the redirect chain for use in the second request.

    resp_2 = req.get('https://m.flyscoot.com/select', headers=headers, cookies=resp_1.history[0].cookies)
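
For reference, a `requests` session normally follows a redirect by itself and carries over any cookies set along the way, so copying them by hand from `resp_1.history[0]` should not be necessary. The sketch below illustrates this against a small local stand-in server. The `/search` to `/select` flow, the `sid` cookie, and the `DemoHandler` and `run_demo` names are assumptions made for illustration only; they do not describe how the real site behaves.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a site with a /search -> /select flow."""

    def do_POST(self):
        # Read the form body so nothing is left unread on the socket.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        if self.path == "/search":
            # Set a session cookie and redirect to the results page.
            self.send_response(302)
            self.send_header("Location", "/select")
            self.send_header("Set-Cookie", "sid=abc123; Path=/")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            self.send_error(404)

    def do_GET(self):
        # /select only works when the cookie from /search is sent back.
        has_cookie = "sid=abc123" in self.headers.get("Cookie", "")
        if self.path == "/select" and has_cookie:
            status, body = 200, b"results"
        else:
            status, body = 500, b"missing session"
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def run_demo():
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = "http://127.0.0.1:%d" % server.server_port
    try:
        session = requests.Session()
        session.trust_env = False  # ignore proxy environment variables locally
        form = {"origin": "KHH", "destination": "KIX"}
        # One call: the 302 is followed and the cookie is sent automatically.
        resp = session.post(base + "/search", data=form)
        return resp, session
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    resp, session = run_demo()
    print(resp.status_code, resp.url)
    print([r.status_code for r in resp.history])
    print(session.cookies.get_dict())
```

If the real site still answers with a 500 after the redirect is followed this way, the cause is more likely in the headers or form body than in the cookie handling.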

1 Answer:

Answer 0 (score: 0)

This looks like a mobile URL. In most cases you should set a browser User-Agent header. Try this (Python 3):

import urllib.parse
import requests

FF_USER_AGENT = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.2; Win64; x64; rv:21.0.0) '
                  'Gecko/20121011 Firefox/21.0.0',
    "Origin": "http://makeabooking.flyscoot.com",
    "Referer": "http://makeabooking.flyscoot.com",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Encoding": "gzip,deflate,sdch",
    "Accept-Language": "fr-FR,fr;q=0.8,en-US;q=0.6,en;q=0.4",
    "Cache-Control": "max-age=0",
    "Connection": "keep-alive",
}

req = requests.session()
resp_1 = req.get('http://makeabooking.flyscoot.com/', headers=FF_USER_AGENT)
# form urlencoded data
raw_data = (
    "availabilitySearch.SearchInfo.SearchStations%5B0%5D.DepartureStationCode"
    "=ADL"
    "&availabilitySearch.SearchInfo.SearchStations%5B0%5D.ArrivalStationCode"
    "=SIN"
    "&availabilitySearch.SearchInfo.SearchStations%5B0%5D.DepartureDate=2%2F17"
    "%2F2016&availabilitySearch.SearchInfo.SearchStations%5B1%5D"
    ".DepartureStationCode=SIN&availabilitySearch.SearchInfo.SearchStations%5B1"
    "%5D.ArrivalStationCode=ADL&availabilitySearch.SearchInfo.SearchStations"
    "%5B1"
    "%5D.DepartureDate=3%2F17%2F2016&availabilitySearch.SearchInfo.Direction"
    "=Return&Singapore+%28SIN%29=Singapore+%28SIN%29&availabilitySearch"
    ".SearchInfo.AdultCount=1&availabilitySearch.SearchInfo.ChildrenCount=0"
    "&availabilitySearch.SearchInfo.InfantCount=0&availabilitySearch.SearchInfo"
    ".PromoCode=")
dict_data = dict(urllib.parse.parse_qsl(raw_data))
final = req.post('http://makeabooking.flyscoot.com/',
                 headers=FF_USER_AGENT,
                 data=dict_data)
print(final.status_code)
print(final.url)
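
As a side note on the `parse_qsl` step, here is a minimal sketch of what it produces for a shortened version of the form body above. By default `parse_qsl` drops fields with an empty value, so the blank `PromoCode` field would not be posted at all; `keep_blank_values=True` preserves it. Whether the site needs that empty field is not known, so treat this as something to check rather than a confirmed fix.

```python
from urllib.parse import parse_qsl, urlencode

# Shortened form body, using three of the fields from the request above.
raw_data = (
    "availabilitySearch.SearchInfo.SearchStations%5B0%5D.DepartureStationCode=ADL"
    "&Singapore+%28SIN%29=Singapore+%28SIN%29"
    "&availabilitySearch.SearchInfo.PromoCode="
)

# Default: percent-escapes and '+' are decoded, blank values are dropped.
default_fields = dict(parse_qsl(raw_data))

# keep_blank_values=True keeps the empty PromoCode field.
all_fields = dict(parse_qsl(raw_data, keep_blank_values=True))

for key, value in all_fields.items():
    print(repr(key), "->", repr(value))

print(len(default_fields), "fields by default,", len(all_fields), "with blanks kept")

# urlencode shows the body that would be rebuilt from the dict.
print(urlencode(all_fields))
```

Passing `all_fields` as `data=` to `req.post` would then send every field, including the empty one.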

[MOBILE Version]

import urllib.parse
import requests

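# Debug output: print the raw HTTP request and response headers
import http.client
import logging

http.client.HTTPConnection.debuglevel = 1
logging.basicConfig()
logging.getLogger().setLevel(logging.DEBUG)
# Logger name of the urllib3 copy bundled with older requests releases;
# newer releases log under "urllib3" instead.
requests_log = logging.getLogger("requests.packages.urllib3")
requests_log.setLevel(logging.DEBUG)
requests_log.propagate = True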

FF_USER_AGENT = {
    'User-Agent': "Mozilla/5.0 (iPhone; CPU iPhone OS 8_0 like Mac OS X) AppleWebKit/600.1.3 (KHTML, like Gecko) Version/8.0 Mobile/12A4345d Safari/600.1.4",
    "Origin": "https://m.flyscoot.com",
    "Referer": "https://m.flyscoot.com/search",
    "Host": "m.flyscoot.com",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Encoding": "gzip,deflate",
    "Accept-Language": "fr-FR,fr;q=0.8,en-US;q=0.6,en;q=0.4",
    "Cache-Control": "max-age=0",
    "Connection": "keep-alive",
    "X-Requested-With": "XMLHttpRequest",
    "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",

}

req = requests.session()
resp_1 = req.get('https://m.flyscoot.com', headers=FF_USER_AGENT)
# form urlencoded data
raw_data = (
    "origin=MEL&destination=CAN&departureDate=20160220&returnDate=20160227"
    "&roundTrip=true&adults=1&children=0&infants=0&promoCode=")
dict_data = dict(urllib.parse.parse_qsl(raw_data))
final = req.post('https://m.flyscoot.com/search',
                 headers=FF_USER_AGENT,
                 data=dict_data)
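
Besides the `http.client` debug output, it can help to build the request without sending it and compare it with what Chrome sends. The sketch below uses `requests.Request` and `Session.prepare_request` for that. The shortened form values and the `prepare` helper name are assumptions for illustration. It shows two things that may matter here: the default User-Agent identifies the client as `python-requests`, and a body passed as a plain string gets no `Content-Type` header unless one is set explicitly.

```python
import requests

SEARCH_URL = "https://m.flyscoot.com/search"
FORM = "origin=MEL&destination=CAN&departureDate=20160220&adults=1"


def prepare(data, headers=None):
    """Build the request as a session would send it, without sending it."""
    session = requests.Session()
    request = requests.Request("POST", SEARCH_URL, data=data, headers=headers)
    return session.prepare_request(request)


# Body given as a raw string: requests does not add a Content-Type header.
as_string = prepare(FORM)

# Body given as a dict: requests form-encodes it and sets the Content-Type.
as_dict = prepare({"origin": "MEL", "destination": "CAN"})

print("string body Content-Type:", as_string.headers.get("Content-Type"))
print("dict body Content-Type:  ", as_dict.headers.get("Content-Type"))
print("dict body:               ", as_dict.body)
print("default User-Agent:      ", as_dict.headers["User-Agent"])
```

Nothing is sent over the network here, so the output can be compared line by line with the request headers shown in the browser's developer tools.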