Problem sending data via a POST request in Python

Time: 2019-07-26 01:10:39

Tags: python python-requests

I am trying to enter a decision start date and end date into the two input boxes on the Gosport Council website by sending a POST request. Whenever I print out the text received back from the request, it shows me the information displayed on the input (search) page rather than the information on the results page that should load.

import requests

payload = {
    "applicationDecisionStart": "1/8/2018",
    "applicationDecisionEnd": "1/10/2018",
}

with requests.Session() as session:
    r = session.get("https://publicaccess.gosport.gov.uk/online-applications/search.do?action=advanced", timeout=10, data=payload)

    print(r.text)

When it runs, I expect it to print out HTML containing href links such as <a href="/online-applications/applicationDetails.do?keyVal=PEA12JHO07E00&amp;activeTab=summary">, but my code does not show anything like that.
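One way to see which page actually came back, instead of dumping the whole HTML, is a small diagnostic check. This is only a sketch; the #searchresults selector is an assumption about the results page markup and is not in my original code:

from bs4 import BeautifulSoup
import requests

payload = {
    "applicationDecisionStart": "1/8/2018",
    "applicationDecisionEnd": "1/10/2018",
}

with requests.Session() as session:
    r = session.get(
        "https://publicaccess.gosport.gov.uk/online-applications/search.do?action=advanced",
        timeout=10,
        data=payload,
    )
    soup = BeautifulSoup(r.text, "html.parser")
    # The results page should contain a #searchresults container; the search form does not
    print("page title:", soup.title.text.strip() if soup.title else None)
    print("has results container:", soup.select_one("#searchresults") is not None)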

2 Answers:

Answer 0: (score: 2)

From observing the request the site actually makes, the POST (not a GET) you need to send is the following (ignoring the empty fields in the POST):

from bs4 import BeautifulSoup as bs
import requests

payload = {
    'caseAddressType': 'Application',
    'date(applicationDecisionStart)': '1/8/2018',
    'date(applicationDecisionEnd)': '1/10/2018',
    'searchType': 'Application',
}

with requests.Session() as s:
    r = s.post('https://publicaccess.gosport.gov.uk/online-applications/advancedSearchResults.do?action=firstPage', data=payload)
    soup = bs(r.content, 'lxml')
    info = [(item.text.strip(), item['href']) for item in soup.select('#searchresults a')]
    print(info)
    ## later pages
    #https://publicaccess.gosport.gov.uk/online-applications/pagedSearchResults.do?action=page&searchCriteria.page=2

Going over all the pages:

from bs4 import BeautifulSoup as bs
import requests

payload = {
    'caseAddressType': 'Application',
    'date(applicationDecisionStart)': '1/8/2018',
    'date(applicationDecisionEnd)': '1/10/2018',
    'searchType': 'Application',
}

with requests.Session() as s:
    r = s.post('https://publicaccess.gosport.gov.uk/online-applications/advancedSearchResults.do?action=firstPage', data=payload)
    soup = bs(r.content, 'lxml')
    info = [(item.text.strip(), item['href']) for item in soup.select('#searchresults a')]
    print(info)
    # the last numbered page link gives the total number of result pages
    pages = int(soup.select('span + a.page')[-1].text)

    for page in range(2, pages + 1):
        r = s.get('https://publicaccess.gosport.gov.uk/online-applications/pagedSearchResults.do?action=page&searchCriteria.page={}'.format(page))
        soup = bs(r.content, 'lxml')
        info = [(item.text.strip(), item['href']) for item in soup.select('#searchresults a')]
        print(info)
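Note that if the search only returns a single page of results, soup.select('span + a.page') can come back empty and the [-1] indexing will raise an IndexError. A defensive variation of the same pagination loop, as a sketch under that assumption about the page markup:

from bs4 import BeautifulSoup as bs
import requests

payload = {
    'caseAddressType': 'Application',
    'date(applicationDecisionStart)': '1/8/2018',
    'date(applicationDecisionEnd)': '1/10/2018',
    'searchType': 'Application',
}

with requests.Session() as s:
    r = s.post('https://publicaccess.gosport.gov.uk/online-applications/advancedSearchResults.do?action=firstPage', data=payload)
    soup = bs(r.content, 'lxml')
    info = [(item.text.strip(), item['href']) for item in soup.select('#searchresults a')]

    # Guard against a single page of results, where no numbered page links exist
    page_links = soup.select('span + a.page')
    pages = int(page_links[-1].text) if page_links else 1

    for page in range(2, pages + 1):
        r = s.get('https://publicaccess.gosport.gov.uk/online-applications/pagedSearchResults.do?action=page&searchCriteria.page={}'.format(page))
        soup = bs(r.content, 'lxml')
        info.extend((item.text.strip(), item['href']) for item in soup.select('#searchresults a'))

    print(info)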

Answer 1: (score: 0)

Your URL and data are incorrect.

Use the Chrome browser to analyse the request

Press F12 to open the developer tools and switch to the Network tab. Then submit your page and analyse the first request initiated by Chrome.

What you need:

  1. Headers - General - Request URL
  2. Headers - Request Headers
  3. Headers - Data

You also need a package to parse the HTML, such as bs4.
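Putting those three pieces together, a minimal sketch of what the replayed request could look like (the form-data field names mirror those shown in the other answer, and the extra header is only illustrative, not necessarily required by the site):

from bs4 import BeautifulSoup
import requests

# Request URL taken from DevTools -> Network -> Headers -> General
url = 'https://publicaccess.gosport.gov.uk/online-applications/advancedSearchResults.do?action=firstPage'

# Form data taken from DevTools -> Network -> Headers -> Data
payload = {
    'caseAddressType': 'Application',
    'date(applicationDecisionStart)': '1/8/2018',
    'date(applicationDecisionEnd)': '1/10/2018',
    'searchType': 'Application',
}

# Request headers taken from DevTools; a User-Agent is usually enough
headers = {'User-Agent': 'Mozilla/5.0'}

with requests.Session() as s:
    r = s.post(url, data=payload, headers=headers, timeout=10)
    soup = BeautifulSoup(r.content, 'html.parser')
    for a in soup.select('#searchresults a'):
        print(a.text.strip(), a['href'])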