Question

I'm trying to use the requests module in Python to submit a CGI form, and I can't figure out what I'm doing wrong.

I've used Google Dev Tools in Chrome to copy what look like the correct parameters and data, but I still haven't got it working.

The site I'm trying to get data from is: http://staffordshirebmd.org.uk/cgi/birthind.cgi

Here is my code:

import requests 

headers = {"Accept":"text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
        "Accept-Encoding":"gzip,deflate,sdch",
        "Accept-Language":"en-US,en;q=0.8",
        "Cache-Control":"no-cache",
        "Connection":"keep-alive",
        "Content-Length":"124",
        "Content-Type":"application/x-www-form-urlencoded",
        "DNT":"1",
        "Host":"staffordshirebmd.org.uk",
        "Origin":"http://staffordshirebmd.org.uk",
        "Pragma":"no-cache",
        "Referer":"http://staffordshirebmd.org.uk/cgi/birthind.cgi?county=staffordshire",
        "User-Agent":"Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5376e Safari/8536.25"}

payload = {"county":"staffordshire",
          "lang": "",
          "year_date":"1837",
          "search_region":"All",
          "sort_by":"alpha",
          "csv_or_list":"screen",
          "letter":"A",
          "submit":"Display Indexes"}

path = "http://staffordshirebmd.org.uk/cgi/birthind.cgi"  # the site URL given above
f = requests.put(path, data=payload, headers=headers)

f.text

This gives the response:

u'<html>\n<body>\n<div>\n<p>\nThe Bookmark you have used to reach this page is not valid.\n</p>\n<p>\nPlease click <a href="http://staffordshirebmd.org.uk/">here</a> to return to the main page and reset your\nbookmark to that page.\n</p>\n</div>\n</body>\n</html>\n\n'

What am I doing wrong?

Answer 1

The form at the URL you use in your Referer header submits with POST, not PUT.

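You should rarely hard-code the Content-Length header; leave it to requests to calculate and set for you. Different browsers can easily produce subtly different content lengths, and a script that only works with a fixed Content-Length won't last long.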

Removing the Content-Length header and changing .put() to .post() gives me results, in any case.
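
The two changes above could be sketched roughly as follows. This is a minimal illustration, not a verified fix: the URL, Referer, and form fields are copied from the question, while the trimmed header set, the `build_request` helper, and the 30-second timeout are assumptions made for the example. Preparing the request before sending it lets you confirm that requests fills in Content-Length on its own.

```python
import requests

# URL and form fields are taken from the question.
URL = "http://staffordshirebmd.org.uk/cgi/birthind.cgi"

# Trimmed header set (an assumption: the server may or may not need more).
# Content-Length is deliberately absent; requests computes it from the body.
HEADERS = {
    "Referer": "http://staffordshirebmd.org.uk/cgi/birthind.cgi?county=staffordshire",
    "Origin": "http://staffordshirebmd.org.uk",
}

PAYLOAD = {
    "county": "staffordshire",
    "lang": "",
    "year_date": "1837",
    "search_region": "All",
    "sort_by": "alpha",
    "csv_or_list": "screen",
    "letter": "A",
    "submit": "Display Indexes",
}


def build_request():
    """Prepare the POST without sending it, so the final headers can be inspected."""
    return requests.Request("POST", URL, data=PAYLOAD, headers=HEADERS).prepare()


if __name__ == "__main__":
    prepared = build_request()
    # Content-Type and Content-Length were set by requests, not by hand.
    print(prepared.method, prepared.headers["Content-Type"],
          prepared.headers["Content-Length"])

    # Sending requires network access to the site.
    with requests.Session() as session:
        response = session.send(prepared, timeout=30)
    response.raise_for_status()
    print(response.text[:500])
```

For a one-off call, `requests.post(URL, data=PAYLOAD, headers=HEADERS)` does the same thing in one line; the prepared-request form is only used here to make the computed headers visible.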
