HTTP server in ZMQ, or how to handle a POST request with pyzmq?

Asked: 2015-10-14 00:30:29

Tags: python http zeromq pyzmq

I am trying to create an HTTP server using a ZMQ_STREAM socket.

When I make a simple POST request:

POST / HTTP/1.1
Host: localhost:5555
Cache-Control: no-cache
Postman-Token: 67004be5-56bc-c1a9-847a-7db3195c301d

Apples to Oranges!

Here is how I handle it with pyzmq:

import zmq

context = zmq.Context()
socket = context.socket(zmq.STREAM)
socket.bind("tcp://*:5555")

while True:
    # Get HTTP request
    parts = []
    id_, msg = socket.recv_multipart()  # [id, ''] or [id, http request]
    parts.append(id_)
    parts.append(msg)
    if not msg:
        # This is a new connection - this is just the identity frame (throw away id_)
        # The body will come next
        id_, msg = socket.recv_multipart() # [id, http request]
        parts.append(id_)
        parts.append(msg)

        end = socket.recv_multipart() # [id*, ''] <- some kind of junk? 
        parts.append(end)

    print("%s" % repr(parts))

The parts list then prints as:

['\x00\x80\x00\x00)', '', '\x00\x80\x00\x00)', 'POST / HTTP/1.1\r\nHost: localhost:5555\r\nConnection: keep-alive\r\nContent-Length: 18\r\nCache-Control: no-cache\r\nOrigin: chrome-extension://fhbjgbiflinjbdggehcddcbncdddomop\r\nContent-Type: text/plain;charset=UTF-8\r\nUser-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36\r\nPostman-Token: 9503fce9-8b1c-b39c-fb4d-3a7f21b509de\r\nAccept: */*\r\nAccept-Encoding: gzip, deflate\r\nAccept-Language: en-US,en;q=0.8,ru;q=0.6,uk;q=0.4\r\n\r\nApples to Oranges!', ['\x00\x80\x00\x00*', '']]

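So my understanding is:

  1. '\x00\x80\x00\x00)', '' is the identity of the connection. It is set initially by the ZMQ_STREAM socket. It does not seem to appear on subsequent requests.
  2. \x00\x80\x00\x00) is the identity again; this is what we see on subsequent requests from the client on the ZMQ_STREAM socket.
  3. Then comes the actual HTTP request.
  4. But then there is the last pair of magic numbers: ['\x00\x80\x00\x00*', '']

What does this mean?

References:

  1. http://api.zeromq.org/4-0:zmq-socket
  2. HTTP 1.1 specification: http://www.w3.org/Protocols/rfc2616/rfc2616-sec5.html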

1 answer:

Answer 0 (score: 5)

  

But the last pair of magic numbers: ['\x00\x80\x00\x00*', '']. What does this mean?

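That is a new connection, with a new connection ID. The connection ID is an integer counter. You can check this with Python's built-in ord: ord(')') = 41 and ord('*') = 42, which is the next number in the sequence.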
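
As a quick check of the counter idea, the two IDs from the question can be decoded and compared. This is only a sketch: the helper name connection_counter is made up here, and the layout it assumes (one zero byte followed by a big-endian 32-bit counter) is inferred from the bytes shown above rather than stated anywhere in this thread.

```python
import struct


def connection_counter(identity):
    """Decode a 5-byte ID as a zero byte plus a big-endian uint32 (assumed layout)."""
    if len(identity) != 5 or identity[:1] != b'\x00':
        raise ValueError("unexpected identity format: %r" % identity)
    (counter,) = struct.unpack('>I', identity[1:])
    return counter


first = connection_counter(b'\x00\x80\x00\x00)')   # ID seen with the HTTP request
second = connection_counter(b'\x00\x80\x00\x00*')  # ID seen in the trailing pair
print(second - first)  # prints 1: the second ID is the next value of the counter
```

Since the IDs should be treated as opaque in real code, this is useful for debugging only.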

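When writing an HTTP server with ZMQ_STREAM you have to be careful, because it is more complicated than receiving a single message once the connection is established. The main problem is that you have no guarantee that a request arrives complete in one message; the body can arrive in chunks spread over several messages. You have to read the HTTP headers and then handle receiving the body yourself.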
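
To make the "is this request complete yet?" question concrete, here is a minimal sketch that works on a plain bytes buffer, with no socket involved. The function name request_is_complete is hypothetical, and it assumes the body length is declared with Content-Length (no chunked transfer encoding).

```python
def request_is_complete(buffer):
    """Return True once `buffer` holds the full headers and the declared body.

    Assumes a Content-Length body; chunked transfer encoding is not handled.
    """
    head, sep, body = buffer.partition(b'\r\n\r\n')
    if not sep:
        # The blank line that ends the headers has not arrived yet
        return False
    content_length = 0
    for line in head.split(b'\r\n')[1:]:  # skip the request line
        name, _, value = line.partition(b':')
        if name.strip().lower() == b'content-length':
            content_length = int(value.strip().decode('ascii'))
    return len(body) >= content_length


# The request from the question, with the body split across two messages
first = b'POST / HTTP/1.1\r\nContent-Length: 18\r\n\r\nApples to '
second = b'Oranges!'
print(request_is_complete(first))           # False: only 10 of 18 body bytes
print(request_is_complete(first + second))  # True: all 18 body bytes buffered
```

A server would keep appending received data frames to the buffer for a given connection ID until this check passes.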

Here is an example that handles POST requests coming from curl:

from traceback import print_exc
import zmq
from tornado.httputil import HTTPHeaders

class BadRequest(Exception):
    pass

class ConnectionLost(Exception):
    pass

def parse_request(request):
    """Parse a request verb, path, and headers"""
    first_line, header_lines = request.split(b'\r\n', 1)
    verb, path, proto = first_line.decode('utf8').split()
    headers = HTTPHeaders.parse(header_lines.decode('utf8', 'replace'))
    return verb, path, headers


def recv_body(socket, headers, chunks, request_id):
    """Receive the body of a request"""
    if headers.get('expect', '').lower() == '100-continue':
        if 'Content-Length' not in headers:
            # Don't support chunked transfer: http://tools.ietf.org/html/rfc2616#section-3.6.1
            print("Only support specified-length requests")
            socket.send_multipart([
                request_id, b'HTTP/1.1 400 (Bad Request)\r\n\r\n',
                request_id, b'',
            ])
            msg = 1
            while msg != b'':
                # flush until new connection
                _, msg = socket.recv_multipart()
            raise BadRequest("Only support specified-length requests")

        socket.send_multipart([request_id, b'HTTP/1.1 100 (Continue)\r\n\r\n'], zmq.SNDMORE)

        content_length = int(headers['Content-Length'])
        print("Waiting to receive %ikB body" % (content_length // 1024))
        while sum(len(chunk) for chunk in chunks) < content_length:
            id_, msg = socket.recv_multipart()
            if msg == b'':
                raise ConnectionLost("Disconnected")
            if id_ != request_id:
                raise ConnectionLost("Received data from wrong ID: %s != %s" % (id_, request_id))
            chunks.append(msg)
    return b''.join(chunks)


print(zmq.__version__, zmq.zmq_version())


socket = zmq.Context().socket(zmq.STREAM)
socket.bind("tcp://*:5555")


while True:
    # Get HTTP request
    request_id, msg = socket.recv_multipart()
    if msg == b'':
        continue
    chunks = []
    try:
        request, first_chunk = msg.split(b'\r\n\r\n', 1)
        if first_chunk:
            chunks.append(first_chunk)
        verb, path, headers = parse_request(request)
        print(verb, path)
        print("Headers:")
        for key, value in headers.items():
            print('  %s: %s' % (key, value))
        body = recv_body(socket, headers, chunks, request_id)
        print("Body: %r" % body)
    except BadRequest as e:
        print("Bad Request: %s" % e)
    except ConnectionLost as e:
        print("Connection Lost: %s" % e)
    except Exception:
        print("Failed to handle request", msg)
        print_exc()
        socket.send_multipart([
            request_id, b'HTTP/1.1 500 (Internal Server Error)\r\n\r\n',
            request_id, b''])
    else:
        socket.send_multipart([
            request_id, b'HTTP/1.1 200 (OK)\r\n\r\n',
            request_id, b''])

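The relevant logic for this case is in the recv_body function, which reads the headers and keeps receiving chunks of the body until it is complete.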
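
The receive loop can also be exercised without a live socket. The sketch below is a stand-in rather than the code above: collect_body is a hypothetical helper that takes an iterator of (id, data) pairs in place of repeated socket.recv_multipart() calls, and it treats running out of frames as a lost connection.

```python
class ConnectionLost(Exception):
    pass


def collect_body(frames, request_id, content_length, chunks):
    """Append data frames to `chunks` until `content_length` bytes are buffered.

    `frames` is any iterator of (id, data) pairs, standing in for repeated
    socket.recv_multipart() calls on a ZMQ_STREAM socket.
    """
    received = sum(len(chunk) for chunk in chunks)
    while received < content_length:
        try:
            id_, data = next(frames)
        except StopIteration:
            raise ConnectionLost("No more frames")
        if data == b'':
            # An empty data frame signals connect/disconnect on ZMQ_STREAM
            raise ConnectionLost("Disconnected")
        if id_ != request_id:
            raise ConnectionLost("Received data from wrong ID")
        chunks.append(data)
        received += len(data)
    return b''.join(chunks)


conn = b'\x00\x80\x00\x00)'
frames = iter([(conn, b'to '), (conn, b'Oranges!')])
# b'Apples ' plays the role of the first chunk that arrived with the headers
body = collect_body(frames, conn, 18, [b'Apples '])
print(body)
```

Note that, like recv_body, this gives up as soon as a frame from another connection shows up; a server that needs to handle interleaved clients would have to buffer per connection ID instead.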

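Frankly, I don't think it makes much sense to write an HTTP server in Python using ZMQ_STREAM. You can integrate zmq sockets with existing Python event loops and reuse well-established HTTP libraries, so you don't have to reinvent this particular wheel. For instance, pyzmq plays especially nicely with the tornado event loop, and you can use zmq sockets and tornado HTTP handlers together in the same application.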
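
A rough sketch of that combination might look like the following. Everything specific in it is an assumption made for illustration: the EchoHandler class, the on_zmq_message callback, the choice of a PULL socket, port 8888, and the tcp://127.0.0.1:5556 endpoint. It relies on pyzmq's ZMQStream wrapper, which registers a zmq socket on tornado's IOLoop.

```python
import zmq
from zmq.eventloop.zmqstream import ZMQStream
from tornado import ioloop, web


class EchoHandler(web.RequestHandler):
    """Hypothetical handler: tornado has already parsed headers and body."""

    def post(self):
        # self.request.body holds the complete request body as bytes
        self.write(self.request.body)


def on_zmq_message(frames):
    """Hypothetical callback, invoked with the list of frames of each zmq message."""
    print("zmq message: %r" % frames)


def main():
    # HTTP side: tornado listens on a port of our choosing (8888 is an assumption)
    app = web.Application([(r"/", EchoHandler)])
    app.listen(8888)

    # zmq side: wrap a PULL socket in a ZMQStream so it shares tornado's IOLoop
    socket = zmq.Context.instance().socket(zmq.PULL)
    socket.bind("tcp://127.0.0.1:5556")
    stream = ZMQStream(socket)
    stream.on_recv(on_zmq_message)

    # Blocks, serving both HTTP requests and zmq messages
    ioloop.IOLoop.current().start()
```

Calling main() starts the loop. With this layout tornado deals with partial reads, Content-Length, and 100-continue, so the POST body from the question would simply be available as self.request.body.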