Getting response headers from a POST request in Python

Time: 2019-11-20 18:33:34

Tags: python post python-requests

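I am trying to scrape Jaap.nl, but I have run into some difficulties. For example, when you search for the city of Amsterdam, you are redirected to a URL that contains more than just "amsterdam".

base_url: https://www.jaap.nl/koophuizen/ > https://www.jaap.nl/koophuizen/noord+holland/groot-amsterdam/amsterdam

I want to capture the extra part (noord+holland/groot-amsterdam/amsterdam). I can see that, before the GET that lands on that page, there is a POST request whose response carries the extended URL in its Location header, but I cannot capture that piece in my code. See the code below: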

import json

import requests


def post_page(type="koophuizen", city="amsterdam"):
    url = f"https://www.jaap.nl/{type}"
    headers = {"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:66.0) Gecko/20100101 Firefox/66.0",
               "Content-Type": "application/x-www-form-urlencoded"}
    payload = {"action": "searchExtensive",
               "url": f"/{type}",
               "search_input_extensive": city}
    response = requests.post(url, data=json.dumps(payload), headers=headers)
    print(response.headers)


post_page()

I get the following response headers:

    {'Cache-Control': 'private', 
     'Content-Type': 'text/html; charset=utf-8', 
     'Content-Encoding': 'gzip', 
     'Vary': 'Accept-Encoding', 
     'Server': 'Microsoft-IIS/8.5', 
     'Set-Cookie': 'SESSIONToken=7f8c65d3-7962-41a8-9604-a996957fd0ad; expires=Tue, 20-Nov-2029 23:11:36 GMT; path=/; HttpOnly, lastcity=76; path=/', 
     'X-AspNetMvc-Version': '4.0', 
     'X-AspNet-Version': '4.0.30319', 
     'X-Powered-By': 'ASP.NET, ARR/3.0, ASP.NET', 
     'strict-transport-security': 'max-age=31536000; includeSubdomains', 
     'X-Handled-By': 'TORNADO', 
     'X-Jaap-Router': 'Routed', 
     'X-Frame-Options': 'SAMEORIGIN', 
     'Date': 'Wed, 20 Nov 2019 23:11:36 GMT', 
     'Content-Length': '32956'}

What I am looking for is:

    "Location": "/koophuizen/noord+holland/groot-amsterdam/amsterdam"

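That is what I see when I inspect the response headers of the POST request in the browser.

I keep getting 200 as the status code, whereas I am looking for a 302. Even with allow_redirects=False and a Session to keep the cookies, I cannot get it to work.

Can someone tell me what I am doing wrong?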
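
One thing that can be checked without touching the network is the body each kind of `data=` argument produces. `requests` form-encodes a dict passed as `data=` and sends a string passed as `data=` unchanged. The sketch below uses only the standard library, with `urlencode` standing in for the form encoding, and takes its payload values from the code above.

```python
import json
from urllib.parse import urlencode

payload = {
    "action": "searchExtensive",
    "url": "/koophuizen",
    "search_input_extensive": "amsterdam",
}

# Body of a real form post (Content-Type: application/x-www-form-urlencoded).
form_body = urlencode(payload)

# Body sent by data=json.dumps(payload): a JSON string, passed through as-is.
json_body = json.dumps(payload)

print(form_body)
print(json_body)
```

The two bodies differ, so a server that parses form fields may not find `action` or `search_input_extensive` in the JSON variant. Whether that alone explains the 200 response is not established here.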

1 answer:

Answer 0 (score: 1)

This works for me:

import requests

# input() already returns a str, so no extra conversion is needed
city_to_search = input("Insert your city: ")

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:71.0) Gecko/20100101 Firefox/71.0',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language': 'it-IT,it;q=0.8,en-US;q=0.5,en;q=0.3',
    'Content-Type': 'application/x-www-form-urlencoded',
    'Origin': 'https://www.jaap.nl',
    'DNT': '1',
    'Connection': 'keep-alive',
    'Referer': 'https://www.jaap.nl/koophuizen/',
    'Upgrade-Insecure-Requests': '1',
}

data = {
  'action': 'searchExtensive',
  'url': '/koophuizen',
  'search_input_extensive': city_to_search
}

response = requests.post('https://www.jaap.nl/', headers=headers, data=data)
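
To read the redirect target from a POST like the one above, one option is to stop `requests` from following redirects and inspect the response directly. The following is a minimal sketch, not a confirmed recipe for this site. The helper names `redirect_target` and `find_city_url` are hypothetical, the shortened header set and the timeout are illustrative choices, and it assumes the site still answers the form post with a 3xx status and a `Location` header.

```python
from urllib.parse import urljoin

BASE_URL = "https://www.jaap.nl/"


def redirect_target(response, base_url=BASE_URL):
    """Return the absolute redirect URL of a 3xx response, or None otherwise."""
    if 300 <= response.status_code < 400:
        location = response.headers.get("Location")
        if location:
            # Location may be relative (e.g. "/koophuizen/..."), so resolve it.
            return urljoin(base_url, location)
    return None


def find_city_url(city, listing_type="koophuizen"):
    """POST the search form without following redirects (hypothetical helper)."""
    # Imported here so redirect_target() can be used without requests installed.
    import requests

    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:71.0) "
                      "Gecko/20100101 Firefox/71.0",
        "Referer": f"{BASE_URL}{listing_type}/",
    }
    data = {
        "action": "searchExtensive",
        "url": f"/{listing_type}",
        "search_input_extensive": city,
    }
    # A dict passed as data= is form-encoded by requests.
    response = requests.post(
        BASE_URL,
        headers=headers,
        data=data,
        allow_redirects=False,  # keep the 3xx response instead of following it
        timeout=10,
    )
    return redirect_target(response)
```

Calling `find_city_url("amsterdam")` would then return the full listing URL if the server redirects, or `None` if it answers with any other status. If redirects are followed instead (the default), `response.url` holds the final URL and `response.history` the intermediate responses.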