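I am looking for a native way to parse an HTTP request in Python 3.
This question shows a way to do it in Python 2, but it relies on modules that are now deprecated (and on Python 2 itself), so I am looking for a way to do it in Python 3.
I mainly want to find out which resource was requested and to parse the headers from a simple request, for example: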
GET /index.html HTTP/1.1
Host: localhost
Connection: keep-alive
Cache-Control: max-age=0
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8
Can someone show me a basic way to parse this request?
Answer 0 (score: 2)
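Each header field is separated from the next by a carriage return followed by a line feed, and each field name is separated from its value by a colon. So, assuming you already have the request as a string (called resp here), it should be as simple as:
fields = resp.split("\r\n")
fields = fields[1:]  # ignore the request line (GET / HTTP/1.1)
output = {}
for field in fields:
    if not field:  # skip the blank line that ends the header section
        continue
    # split on the first colon only, because values such as "localhost:8080" contain colons too
    key, value = field.split(":", 1)
    output[key] = value.strip()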
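The question also asks which resource was requested, and that lives in the request line the loop above skips. Here is a minimal sketch that returns the request line parts together with the headers. The function name parse_request and the sample raw_request are made up for illustration, and it assumes a well-formed request with CRLF line endings.

```python
def parse_request(raw):
    """Split a raw HTTP request string into its request line parts and headers."""
    # Everything before the first blank line is the request line plus headers
    head, _, _body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")

    # Request line: METHOD SP REQUEST-TARGET SP HTTP-VERSION
    method, target, version = lines[0].split(" ", 2)

    headers = {}
    for line in lines[1:]:
        if not line:
            continue
        # Split on the first colon only, values may contain colons themselves
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them lower-cased
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers


# Hypothetical sample request, shortened from the one in the question
raw_request = (
    "GET /index.html HTTP/1.1\r\n"
    "Host: localhost:8080\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
)

method, target, version, headers = parse_request(raw_request)
print(method, target, version)
print(headers["host"])
```

Note that a plain dict keeps only the last value when a header name is repeated, so this is only suitable for simple requests like the one in the question.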
Answer 1 (score: 1)
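You can use the email.message.Message class from the email module in the standard library.
Adapting the answer from the question you linked, here is a Python 3 example that parses the HTTP headers.
Suppose you want to build a dictionary containing all of the header fields: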
import email
import pprint
from io import StringIO
request_string = 'GET / HTTP/1.1\r\nHost: localhost\r\nConnection: keep-alive\r\nCache-Control: max-age=0\r\nUpgrade-Insecure-Requests: 1\r\nUser-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8\r\nAccept-Encoding: gzip, deflate, sdch\r\nAccept-Language: en-US,en;q=0.8'
# pop the first line so we only process headers
_, headers = request_string.split('\r\n', 1)
# construct a message from the request string
message = email.message_from_file(StringIO(headers))
# construct a dictionary containing the headers
headers = dict(message.items())
# pretty-print the dictionary of headers
pprint.pprint(headers, width=160)
If you run this at a Python prompt, the result looks like this:
{'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
'Accept-Encoding': 'gzip, deflate, sdch',
'Accept-Language': 'en-US,en;q=0.8',
'Cache-Control': 'max-age=0',
'Connection': 'keep-alive',
'Host': 'localhost',
'Upgrade-Insecure-Requests': '1',
'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36'}
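One thing to be aware of before converting to a dict: the Message object itself does case-insensitive lookups and can return every value of a repeated header, and both of those are lost in a plain dictionary. A small sketch follows. The sample headers here are made up for illustration and are not part of the request in the question.

```python
import email

# Hypothetical headers with a repeated field name
raw_headers = (
    "Host: localhost\r\n"
    "Accept-Language: en-US\r\n"
    "Accept-Language: fr\r\n"
)

message = email.message_from_string(raw_headers)

# Lookups on the Message object ignore the case of the header name
host = message["host"]

# get_all() returns every value for a repeated header, in order
languages = message.get_all("Accept-Language")

# A plain dict keeps only the last value for a repeated name
as_dict = dict(message.items())

print(host)
print(languages)
print(as_dict["Accept-Language"])
```

If you need either behaviour, it is probably simpler to keep working with the message object rather than the dictionary.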
Answer 2 (score: 0)
There is another way to handle the headers that is simpler, safer, and more object-oriented. See answer #61189692 under "Parse raw HTTP Headers".
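The linked answer is not reproduced here, so the following is only a sketch of one object-oriented approach that is often used for this, and it may differ from what that answer does. It reuses the parser in http.server.BaseHTTPRequestHandler by feeding it an in-memory buffer instead of a socket. The class name ParsedRequest and the attributes error_code and error_message are made up for this example.

```python
from http.server import BaseHTTPRequestHandler
from io import BytesIO


class ParsedRequest(BaseHTTPRequestHandler):
    """Parse raw request bytes with the standard library's own request parser."""

    def __init__(self, raw_bytes):
        # Deliberately skip the parent __init__, which expects a live socket
        self.rfile = BytesIO(raw_bytes)
        self.raw_requestline = self.rfile.readline()
        self.error_code = None
        self.error_message = None
        # Sets self.command, self.path, self.request_version and self.headers
        self.parse_request()

    def send_error(self, code, message=None, explain=None):
        # Record parse errors instead of writing a response to a socket
        self.error_code = code
        self.error_message = message


# Hypothetical sample request, shortened from the one in the question
request = ParsedRequest(
    b"GET /index.html HTTP/1.1\r\n"
    b"Host: localhost\r\n"
    b"Connection: keep-alive\r\n"
    b"\r\n"
)

print(request.command, request.path, request.request_version)
print(request.headers["Host"])
print(request.error_code)
```

This gives you the requested resource in request.path and the headers in request.headers, at the cost of relying on a class that was designed to be driven by a server rather than called directly.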