Python socket readline without socket.makefile()

Time: 2015-03-13 02:25:12

Tags: python sockets

I am trying to parse the HTTP request line (e.g. 'GET / HTTP/1.1\r\n'), which is easy with socket.makefile().readline() (BaseHTTPRequestHandler uses it), like so:

print sock.makefile().readline()

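Unfortunately, as the documentation says, the socket must be in blocking mode (it cannot have a timeout) when makefile() is used. How can I implement a readline()-like function that does the same thing without the makefile() file-object interface, and without reading more than necessary (since that would discard data I need afterwards)?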

A very inefficient example:

request_line = ""
while not request_line.endswith('\n'):
    request_line += sock.recv(1)
print request_line 

3 Answers:

Answer 0: (score: 2)

How about:

import StringIO

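buff = StringIO.StringIO()              # In-memory buffer (the argument is initial content, not a size)
while True:
    data = sock.recv(2048)              # Pull what it can, up to 2048 bytes (recv() requires a size)
    if not data: break                  # Empty string means the peer closed the connection
    buff.write(data)                    # Append that segment to the buffer
    if '\n' in data: break              # If that segment had '\n', break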

# Get the buffer data, split it over newlines, print the first line
print buff.getvalue().splitlines()[0]

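This approach avoids very expensive string concatenation. It also pulls as much data as is available from the socket on each call, so any bytes received after the first newline stay in buff rather than on the socket.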
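If the socket has to be left positioned exactly after the line (which is what the question asks for), one alternative is to peek at the pending data with MSG_PEEK and then consume only the bytes up to the newline. This is a different technique from the buffer above, shown as a rough Python 3 sketch. The function name recv_line_peek and the parameters max_len and chunk are made up for illustration, and the socketpair is only there to make the demo runnable.

```python
import socket


def recv_line_peek(sock, max_len=65536, chunk=4096):
    """Read one line; bytes after the newline stay unread on the socket.

    Hypothetical helper: max_len and chunk are illustrative defaults.
    """
    line = bytearray()
    while len(line) < max_len:
        # Look at pending data without removing it from the kernel buffer.
        peeked = sock.recv(chunk, socket.MSG_PEEK)
        if not peeked:
            break  # peer closed the connection before a newline arrived
        idx = peeked.find(b"\n")
        # Consume up to and including the newline, or everything peeked
        # if no newline has arrived yet.
        take = len(peeked) if idx == -1 else idx + 1
        data = sock.recv(take)
        line += data
        if data.endswith(b"\n"):
            break
    return bytes(line)


# Demo on a local socket pair standing in for a real connection.
a, b = socket.socketpair()
a.sendall(b"GET / HTTP/1.1\r\nHost: example\r\n\r\n")
line = recv_line_peek(b)
rest = b.recv(1024)  # the headers are still waiting on the socket
a.close()
b.close()
```

The trade-off is two recv() calls per round instead of one, in exchange for not needing a user-space buffer for the leftover bytes. If the socket has a timeout set, the timeout exception from recv() simply propagates to the caller.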

Answer 1: (score: 2)

SocketStreamReader

Here is a (buffered) line reader that does not use asyncio. It can be used as a "synchronous", socket-based replacement for asyncio.StreamReader.

import socket
from asyncio import IncompleteReadError  # only import the exception class


class SocketStreamReader:
    def __init__(self, sock: socket.socket):
        self._sock = sock
        self._recv_buffer = bytearray()

    def read(self, num_bytes: int = -1) -> bytes:
        raise NotImplementedError

    def readexactly(self, num_bytes: int) -> bytes:
        buf = bytearray(num_bytes)
        pos = 0
        while pos < num_bytes:
            n = self._recv_into(memoryview(buf)[pos:])
            if n == 0:
                raise IncompleteReadError(bytes(buf[:pos]), num_bytes)
            pos += n
        return bytes(buf)

    def readline(self) -> bytes:
        return self.readuntil(b"\n")

    def readuntil(self, separator: bytes = b"\n") -> bytes:
        if len(separator) != 1:
            raise ValueError("Only separators of length 1 are supported.")

        chunk = bytearray(4096)
        start = 0
        buf = bytearray(len(self._recv_buffer))
        bytes_read = self._recv_into(memoryview(buf))
        assert bytes_read == len(buf)

        while True:
            idx = buf.find(separator, start)
            if idx != -1:
                break

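            start = len(buf)  # resume the search where the new data begins
            bytes_read = self._recv_into(memoryview(chunk))
            if bytes_read == 0:  # peer closed before a separator arrived
                raise IncompleteReadError(bytes(buf), None)
            buf += memoryview(chunk)[:bytes_read]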

        result = bytes(buf[: idx + 1])
        self._recv_buffer = b"".join(
            (memoryview(buf)[idx + 1 :], self._recv_buffer)
        )
        return result

    def _recv_into(self, view: memoryview) -> int:
        bytes_read = min(len(view), len(self._recv_buffer))
        view[:bytes_read] = self._recv_buffer[:bytes_read]
        self._recv_buffer = self._recv_buffer[bytes_read:]
        if bytes_read == len(view):
            return bytes_read
        bytes_read += self._sock.recv_into(view[bytes_read:])
        return bytes_read

Usage:

reader = SocketStreamReader(sock)
line = reader.readline()

Answer 2: (score: 0)

Here is my solution, written in Python 3. In this example I use io.BytesIO.read() instead of socket.recv(), but the idea is the same:

CHUNK_SIZE = 16  # you can set it larger or smaller
buffer = bytearray()
while True:
  chunk = stream.read(CHUNK_SIZE)
  buffer.extend(chunk)
  if b'\n' in chunk or not chunk:
    break
end = buffer.find(b'\n')  # -1 if the stream ended without a newline
firstline = buffer if end == -1 else buffer[:end]

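However, the rest of the message is now partly in the buffer and partly still waiting in the socket. You can either keep writing into the buffer and reading from it, so that the whole request ends up in one piece (this should be fine unless you are parsing very large requests), or wrap it in a generator and read it part by part: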

def reader(buffer, stream):
  yield buffer[buffer.find(b'\n') + 1:]
  while True:
    chunk = stream.read(2048)
    if not chunk: break
    yield chunk
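
To see how the two snippets fit together, here is a minimal self-contained sketch that again uses io.BytesIO as a stand-in for the socket. The function name read_first_line and the sample request bytes are invented for the demo; reader() is the generator from above.

```python
import io

CHUNK_SIZE = 16  # deliberately small so the loop runs more than once on real input


def read_first_line(stream):
    """Return (first_line, buffer); buffer may hold bytes past the newline."""
    buffer = bytearray()
    while True:
        chunk = stream.read(CHUNK_SIZE)
        buffer.extend(chunk)
        if b"\n" in chunk or not chunk:
            break
    end = buffer.find(b"\n")  # -1 if the stream ended without a newline
    first = bytes(buffer) if end == -1 else bytes(buffer[:end])
    return first, buffer


def reader(buffer, stream):
    # First hand out whatever was over-read past the newline...
    yield buffer[buffer.find(b"\n") + 1:]
    # ...then keep pulling fresh chunks until the stream is exhausted.
    while True:
        chunk = stream.read(2048)
        if not chunk:
            break
        yield chunk


# Sample request, made up for the demo.
stream = io.BytesIO(b"GET / HTTP/1.1\r\nHost: example\r\n\r\nbody")
first, buffer = read_first_line(stream)
rest = b"".join(reader(buffer, stream))
```

Note that the first line is cut at '\n', so a trailing '\r' from an HTTP line ending is still attached and has to be stripped separately.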