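Incorrect response from a POST request made with requests

Time: 2017-03-16 08:34:27

Tags: python python-3.x post beautifulsoup python-requests

Search URL - http://aptaapps.apta.org/findapt/Default.aspx?UniqueKey=

I need to get the data for a ZIP code (10017). I send a POST request, but I get back the search page (the same response as for the search URL) rather than the page with the results.

My code: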

# -*- coding: UTF-8 -*-

import requests
from bs4 import BeautifulSoup, element


search_url = "http://aptaapps.apta.org/findapt/Default.aspx?UniqueKey="
session = requests.Session()
r = session.get(search_url)
post_page = BeautifulSoup(r.text, "lxml")
try:
    target_value = post_page.find("input", id="__EVENTTARGET")["value"]
except TypeError:
    target_value = ""

try:
    arg_value = post_page.find("input", id="__EVENTARGUMENT")["value"]
except TypeError:
    arg_value = ""

try:
    state_value = post_page.find("input", id="__VIEWSTATE")["value"]
except TypeError:
    state_value = ""

try:
    generator_value = post_page.find("input", id="__VIEWSTATEGENERATOR")["value"]
except TypeError:
    generator_value = ""

try:
    validation_value = post_page.find("input", id="__EVENTVALIDATION")["value"]
except TypeError:
    validation_value = ""

post_data = {
            "__EVENTTARGET": target_value,
            "__EVENTARGUMENT": arg_value,
            "__VIEWSTATE": state_value,
            "__VIEWSTATEGENERATOR": generator_value,
            "__EVENTVALIDATION": validation_value,
            "ctl00$SearchTerms2": "",
            "ctl00$maincontent$txtZIP": "10017",
            "ctl00$maincontent$txtCity": "",
            "ctl00$maincontent$lstStateProvince": "",
            "ctl00$maincontent$radDist": "1",
            "ctl00$maincontent$btnSearch": "Find a Physical Therapist"
            }

headers = {
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
        "Accept-Encoding": "gzip, deflate",
        "Accept-Language": "ru-RU,ru;q=0.8,en-US;q=0.6,en;q=0.4",
        "Cache-Control": "max-age=0",
        "Content-Length": "3025",
        "Content-Type": "application/x-www-form-urlencoded",
        "Host": "aptaapps.apta.org",
        "Origin": "http://aptaapps.apta.org",
        "Proxy-Connection": "keep-alive",
        "Referer": "http://aptaapps.apta.org/findapt/default.aspx?UniqueKey=",
        "Upgrade-Insecure-Requests": "1",
        "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36"
        }

post_r = session.post(search_url, data=post_data, headers=headers)
print(post_r.text)

1 answer:

Answer 0: (score: 0)

Short answer:

Try replacing:

post_r = session.post(search_url, data=post_data, headers=headers)

with:

post_r = session.post(search_url, json=post_data, headers=headers)

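Long answer:

For the POST method, there are several data types the body can be posted as, such as form-data, x-www-form-urlencoded, application/json, file, and so on.

You should know which type the POST data is. There is a wonderful Chrome plugin called Postman. You can use it to try different data types and find the correct one.

Once you have found it, use the correct parameter key in requests.post: the data parameter is for form-data and x-www-form-urlencoded, and the json parameter is for JSON. You can refer to the requests documentation to learn more about the parameters.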
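To make the difference between two of these body types concrete, here is a minimal standard-library sketch of how the same dictionary looks on the wire as a form-encoded body and as a JSON body. It only approximates what requests produces for a dict passed as data= or json=, and the one-field payload is a shortened, illustrative version of the form in the question.

```python
import json
from urllib.parse import urlencode

# Illustrative one-field payload; the field name is borrowed from the question's form.
payload = {"ctl00$maincontent$txtZIP": "10017"}

# Form-encoded body, as sent with Content-Type: application/x-www-form-urlencoded.
# Reserved characters such as "$" are percent-encoded.
form_body = urlencode(payload)

# JSON body, as sent with Content-Type: application/json.
json_body = json.dumps(payload)

print(form_body)
print(json_body)
```

The two bodies are not interchangeable: a server that parses one format will generally not find its fields in the other.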
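The rule above can be sketched as a small lookup from the Content-Type that the browser (or Postman) shows for the real request to the requests.post keyword. The helper name requests_keyword_for is hypothetical and not part of requests, and only the two types the rule names explicitly are covered.

```python
def requests_keyword_for(content_type):
    """Return the requests.post keyword matching a Content-Type header value.

    Hypothetical helper for illustration; it is not part of the requests library.
    """
    # Drop parameters such as "; charset=utf-8" and normalize the case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/x-www-form-urlencoded":
        return "data"
    if media_type == "application/json":
        return "json"
    raise ValueError("no keyword known for content type: " + content_type)


# Example: look up the keyword for a header value observed in the browser.
observed = "application/json; charset=utf-8"
print(requests_keyword_for(observed))
```

The returned name can then be used as the keyword argument, for example session.post(url, **{keyword: payload}).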