Question

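My app reads bytes from a TCP socket and needs to buffer them so that I can extract messages from them later. Because of the nature of TCP, a single read may return a partial message or several messages, so after each read I want to inspect the buffer and extract as many complete messages as are available.

I therefore want a class that lets me do the following:

Append arbitrary byte[] data to it
Inspect the content without consuming it, in particular check how much content there is and search for the presence of a certain byte or bytes
Extract and consume part of the data as a byte[], leaving the rest in place for a future read

I expect this can be done with one or more existing classes in the .NET library, but I'm not sure which ones. System.IO.MemoryStream looks close to what I want, but (a) it isn't clear whether it is suited to being used as a buffer (is read data removed from the capacity?) and (b) reads and writes seem to happen at the same place: "The current position of a stream is the position at which the next read or write operation could take place." That is not what I want. I need to write to the end and read from the front.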

Answer 1

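I suggest you use a MemoryStream under the hood, but encapsulate it in another class that stores:

The MemoryStream
The current "read" position
The current "consumed" position

It would then expose:

Write: set the stream's position to the end, write the data, then set the stream's position back to the read position
Read: read the data, then set the read position to the stream's position
Consume: update the consumed position (the details depend on how you're trying to consume); if the consumed position is above a certain threshold, copy the remaining buffered data into a new MemoryStream and update all the variables. (You probably don't want to copy the buffer on every consume request.)

Note that none of this is thread-safe without extra synchronization.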
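
The design above can be sketched roughly as follows. This is only one possible reading of it: the class name, the Rewind method, and the 4096-byte compaction threshold are assumptions made for illustration, and the class is not thread-safe.

```csharp
using System;
using System.IO;

// Hypothetical wrapper illustrating the read/consumed-position idea.
public sealed class StreamBackedBuffer
{
    private const int CompactThreshold = 4096; // assumed value, tune as needed
    private MemoryStream _stream = new MemoryStream();
    private long _readPosition;     // where the next Read starts
    private long _consumedPosition; // everything before this may be discarded

    // Bytes written but not yet read.
    public long Available => _stream.Length - _readPosition;

    public void Write(byte[] data)
    {
        _stream.Position = _stream.Length;  // append at the end
        _stream.Write(data, 0, data.Length);
        _stream.Position = _readPosition;   // restore the read position
    }

    public int Read(byte[] destination, int offset, int count)
    {
        _stream.Position = _readPosition;
        int read = _stream.Read(destination, offset, count);
        _readPosition = _stream.Position;
        return read;
    }

    // Marks everything read so far as consumed and compacts the
    // underlying stream once enough dead bytes have accumulated.
    public void Consume()
    {
        _consumedPosition = _readPosition;
        if (_consumedPosition < CompactThreshold)
            return;

        var fresh = new MemoryStream();
        _stream.Position = _consumedPosition;
        _stream.CopyTo(fresh);              // keep only the unconsumed tail
        _stream.Dispose();
        _stream = fresh;
        _readPosition = 0;
        _consumedPosition = 0;
    }

    // Moves the read position back to the last consumed position,
    // so data that was read but not consumed can be read again.
    public void Rewind() => _readPosition = _consumedPosition;
}
```

A caller would typically Read to inspect the data, then either Consume once a complete message has been found or Rewind to wait for more bytes.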

Answer 2

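Just use a big byte array and Array.Copy; that should do the trick. If not, use a List<byte>.

If you use the array, you have to maintain an index into it yourself (the position where you copy additional data) and do the same to track the content size, but it's straightforward.

If you're interested, here is a simple implementation of a "cyclic buffer". It should work (I threw a couple of unit tests at it, but they didn't cover every critical path):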

using System;
using System.Collections.Generic;

public class ReadWriteBuffer
{
    private readonly byte[] _buffer;
    private int _startIndex, _endIndex;

    public ReadWriteBuffer(int capacity)
    {
        _buffer = new byte[capacity];
    }

    public int Count
    {
        get
        {
            if (_endIndex > _startIndex)
                return _endIndex - _startIndex;
            if (_endIndex < _startIndex)
                return (_buffer.Length - _startIndex) + _endIndex;
            return 0;
        }
    }

    public void Write(byte[] data)
    {
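        // Keep one slot unused: if the buffer could fill completely,
        // _endIndex would equal _startIndex and Count would report 0.
        if (Count + data.Length >= _buffer.Length)
            throw new InvalidOperationException("buffer overflow");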
        if (_endIndex + data.Length >= _buffer.Length)
        {
            var endLen = _buffer.Length - _endIndex;
            var remainingLen = data.Length - endLen;

            Array.Copy(data, 0, _buffer, _endIndex, endLen);
            Array.Copy(data, endLen, _buffer, 0, remainingLen);
            _endIndex = remainingLen;
        }
        else
        {
            Array.Copy(data, 0, _buffer, _endIndex, data.Length);
            _endIndex += data.Length;
        }
    }

    public byte[] Read(int len, bool keepData = false)
    {
        if (len > Count)
            throw new Exception("not enough data in buffer");
        var result = new byte[len];
        if (_startIndex + len < _buffer.Length)
        {
            Array.Copy(_buffer, _startIndex, result, 0, len);
            if (!keepData)
                _startIndex += len;
            return result;
        }
        else
        {
            var endLen = _buffer.Length - _startIndex;
            var remainingLen = len - endLen;
            Array.Copy(_buffer, _startIndex, result, 0, endLen);
            Array.Copy(_buffer, 0, result, endLen, remainingLen);
            if (!keepData)
                _startIndex = remainingLen;
            return result;
        }
    }

    public byte this[int index]
    {
        get
        {
            if (index >= Count)
                throw new ArgumentOutOfRangeException();
            return _buffer[(_startIndex + index) % _buffer.Length];
        }
    }

    public IEnumerable<byte> Bytes
    {
        get
        {
            for (var i = 0; i < Count; i++)
                yield return _buffer[(_startIndex + i) % _buffer.Length];
        }
    }
}

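Note: by default the code "consumes" the data on a read. If you don't want that, pass keepData: true (the optional parameter), or simply remove the "_startIndex = ..." assignments.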

Answer 3

I think BufferedStream is the solution to this problem. It is also possible to "unread" len bytes of data by calling Seek.

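BufferedStream buffer = new BufferedStream(tcpStream, size); // we have a buffer of size
...
...
while(...)
{
    buffer.Read(...);
    // do my stuff
    // I read too much, I want to put back len bytes.
    // Note: Seek only works when the underlying stream is seekable;
    // NetworkStream is not, so Seek throws NotSupportedException there.
    buffer.Seek(-len, SeekOrigin.Current);
    // I shall come back and read later
}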

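Growing memory

In contrast to BufferedStream, whose size is specified up front, MemoryStream can grow.

Remembering streamed data

MemoryStream keeps all the data it has been given the whole time, while BufferedStream holds only a segment of the stream's data.

Source stream vs. byte array

MemoryStream lets you add input bytes through its Write() method and Read() them later, whereas BufferedStream takes its input bytes from another source stream specified in the constructor.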
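
As a small illustration of the differences listed above, the following sketch writes to a MemoryStream and reads the data back, then reads through a BufferedStream that wraps a separate source stream. The class and method names are made up for this example; a second MemoryStream stands in for the TCP stream.

```csharp
using System;
using System.IO;

public static class StreamComparison
{
    // Returns { first byte read back, MemoryStream length, first byte via BufferedStream }.
    public static long[] Demo()
    {
        // MemoryStream starts empty and grows as data is written to it.
        var memory = new MemoryStream();
        memory.Write(new byte[] { 1, 2, 3 }, 0, 3);
        memory.Position = 0;               // reads and writes share one position
        int firstBack = memory.ReadByte(); // 1
        long stillHeld = memory.Length;    // 3: reading does not remove data

        // BufferedStream has no data of its own; it reads through the
        // source stream passed to its constructor (here a stand-in stream).
        var source = new MemoryStream(new byte[] { 10, 20, 30 });
        using (var buffered = new BufferedStream(source, 2))
        {
            int fromSource = buffered.ReadByte(); // 10
            return new long[] { firstBack, stillHeld, fromSource };
        }
    }
}
```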

Answer 4

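Here is another implementation of a buffer that I wrote a while ago:

Resizable: lets you queue up data instead of throwing a buffer overflow exception;
Efficient: uses a single buffer and Buffer.BlockCopy operations to enqueue and dequeue data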
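
The linked implementation is not reproduced here. As a rough idea of what a resizable, single-array buffer built on Buffer.BlockCopy might look like, here is a minimal sketch; the class name, the initial size of 256, and the doubling growth policy are all assumptions, not the answer's actual code.

```csharp
using System;

// Hypothetical sketch of a growable FIFO byte buffer.
public sealed class GrowableByteQueue
{
    private byte[] _buffer = new byte[256]; // assumed initial capacity
    private int _head;   // index of the first unread byte
    private int _count;  // number of buffered bytes

    public int Count => _count;

    public void Enqueue(byte[] data)
    {
        EnsureSpace(data.Length);
        Buffer.BlockCopy(data, 0, _buffer, _head + _count, data.Length);
        _count += data.Length;
    }

    public byte[] Dequeue(int length)
    {
        if (length < 0 || length > _count)
            throw new ArgumentOutOfRangeException(nameof(length));

        var result = new byte[length];
        Buffer.BlockCopy(_buffer, _head, result, 0, length);
        _head += length;
        _count -= length;
        if (_count == 0)
            _head = 0; // cheap reset once the queue is drained
        return result;
    }

    // Makes room for 'extra' bytes after the live data, either by sliding
    // the live bytes to the front of the array or by allocating a larger one.
    private void EnsureSpace(int extra)
    {
        if (_head + _count + extra <= _buffer.Length)
            return;

        int needed = _count + extra;
        byte[] target = needed <= _buffer.Length
            ? _buffer
            : new byte[Math.Max(needed, _buffer.Length * 2)];
        Buffer.BlockCopy(_buffer, _head, target, 0, _count);
        _buffer = target;
        _head = 0;
    }
}
```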

Answer 5

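Coming to this late, but for posterity:

When I've done this in the past, I've taken a slightly different approach. If your messages have a fixed-size header (which tells you how many bytes are in the body), and bearing in mind that the network stream is already buffered, I perform the operation in two phases:

A read on the stream for the bytes of the header
A subsequent read on the stream for the bytes of the body, based on the header
Repeat

This leverages the fact that, for a stream, when you ask for 'n' bytes you will never get more back, so you can ignore many of the "oops, I read too many, let me put these to one side until next time" issues.

Now, to be fair, this isn't the whole story. I had an underlying wrapper class over the stream to handle fragmentation (i.e. if asked for 4 bytes, don't return until 4 bytes have been received or the stream is closed). But that part is fairly easy.

In my mind the key is to decouple the message handling from the stream mechanics. If you stop trying to consume a message as a single ReadBytes() from the stream, life becomes much simpler.

[All of this holds whether your reads are blocking or async (APM/await).]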
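
A minimal sketch of the two-phase read described above, including the small "read exactly n bytes" helper. The wire format (a 4-byte little-endian length prefix) and all names here are assumptions chosen for illustration; the answer does not specify its header layout.

```csharp
using System;
using System.IO;

public static class FramedReader
{
    // Keeps reading until exactly 'count' bytes have arrived.
    // Returns null if the stream ends before that.
    public static byte[] ReadExactly(Stream stream, int count)
    {
        var buffer = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int read = stream.Read(buffer, offset, count - offset);
            if (read == 0)
                return null; // stream closed mid-message
            offset += read;
        }
        return buffer;
    }

    // Assumed format: 4-byte little-endian body length, then the body.
    public static byte[] ReadMessage(Stream stream)
    {
        var header = ReadExactly(stream, 4);
        if (header == null)
            return null;

        int length = header[0] | (header[1] << 8) | (header[2] << 16) | (header[3] << 24);
        if (length < 0)
            throw new InvalidDataException("negative body length");

        return ReadExactly(stream, length);
    }
}
```

Because each read asks only for the bytes it needs, nothing ever has to be pushed back onto the stream. In production code the declared length should also be checked against a sensible maximum before allocating.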

Answer 6

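It sounds like you want to read from the socket into a MemoryStream buffer, then "pop" the data out of the buffer and reset it each time a certain byte is encountered. It would look something like this: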

// END_OF_MESSAGE is the protocol's delimiter byte, defined elsewhere.
void ReceiveAllMessages(Action<byte[]> messageReceived, Socket socket)
{
    var currentMessage = new MemoryStream();
    var buffer = new byte[128];

    while (true)
    {
        var read = socket.Receive(buffer, 0, buffer.Length, SocketFlags.None);
        if (read == 0)
            break;     // Connection closed

        for (var i = 0; i < read; i++)
        {
            var currentByte = buffer[i];
            if (currentByte == END_OF_MESSAGE)
            {
                // Hand over the completed message and start a new one
                var message = currentMessage.ToArray();
                messageReceived(message);

                currentMessage = new MemoryStream();
            }
            else
            {
                currentMessage.WriteByte(currentByte);
            }
        }
    }
}

Answer 7

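You could do this with a Stream wrapping a ConcurrentQueue<ArraySegment<byte>> (keep in mind that this makes it forward-only). However, I really dislike the idea of keeping data in memory before doing something with it; it opens you up to a whole range of attacks (intentional or not) based on message size. You might also want to Google "circular buffer".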
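
For illustration, a forward-only Stream over queued segments might look roughly like the sketch below. The class name and the Enqueue method are assumptions, and a single reader is assumed. Note that Read returns 0 when nothing is currently buffered, which most Stream consumers interpret as end of stream, so a real implementation would need to block or signal instead.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;

// Hypothetical forward-only stream over queued byte segments.
public sealed class SegmentQueueStream : Stream
{
    private readonly ConcurrentQueue<ArraySegment<byte>> _segments =
        new ConcurrentQueue<ArraySegment<byte>>();
    private ArraySegment<byte> _current; // partially read segment, if any

    public void Enqueue(ArraySegment<byte> segment) => _segments.Enqueue(segment);

    public override int Read(byte[] buffer, int offset, int count)
    {
        int total = 0;
        while (total < count)
        {
            if (_current.Count == 0)
            {
                if (!_segments.TryDequeue(out _current))
                    break;    // nothing buffered right now
                continue;     // re-check in case an empty segment was queued
            }

            int n = Math.Min(count - total, _current.Count);
            Buffer.BlockCopy(_current.Array, _current.Offset, buffer, offset + total, n);
            _current = new ArraySegment<byte>(
                _current.Array, _current.Offset + n, _current.Count - n);
            total += n;
        }
        return total;
    }

    public override bool CanRead => true;
    public override bool CanSeek => false;  // forward-only
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}
```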

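What you should really be doing is writing code that does something meaningful with the data as soon as it is received: "push parsing" (this is what SAX supports, for example). As an example of how you would do this with text: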

private Encoding _encoding;
private Decoder _decoder;
private char[] _charData = new char[4];

public PushTextReader(Encoding encoding)
{
    _encoding = encoding;
    _decoder = _encoding.GetDecoder();
}

// A single connection requires its own decoder
// and charData. That connection should never
// call this method from multiple threads
// simultaneously.
// If you are using the ReadAsyncLoop you
// don't need to worry about it.
public void ReceiveData(ArraySegment<byte> data)
{
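    // Passing flush: false to both calls lets the decoder keep
    // 'partial' characters buffered until the next chunk arrives.
    var charCount = _decoder.GetCharCount(data.Array, data.Offset, data.Count, false);
    // Grow the scratch buffer if this chunk decodes to more chars than it holds.
    if (charCount > _charData.Length)
        _charData = new char[charCount];
    charCount = _decoder.GetChars(data.Array, data.Offset, data.Count, _charData, 0, false);
    OnCharacterData(new ArraySegment<char>(_charData, 0, charCount));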
}

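If you must be able to accept complete messages before deserializing them, you could use a MemoryMappedFile, which has the advantage that the sending entity won't be able to make your server run out of memory. The tricky part is resetting the file back to zero, because that can cause a whole set of problems. One way to tackle this is:

TCP receiver end

Write to the current stream.
If the stream exceeds a certain length, move to a new one.

Deserialization end

Read from the current stream.
Once you have emptied the stream, destroy it.

The TCP receiver end is very simple. The deserializer end needs some elementary buffer-stitching logic (remember to use Buffer.BlockCopy rather than Array.Copy).

Side note: this sounds like a fun project. If I have the time and remember, I may go ahead and implement this system.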

Answer 8

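Only three answers here provide code. One of them is clumsy, and the others do not answer the question.

Here is a class that you can just copy and paste: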

/// <summary>
/// This class is a very fast and threadsafe FIFO buffer
/// </summary>
public class FastFifo
{
    private List<Byte> mi_FifoData = new List<Byte>();

    /// <summary>
    /// Get the count of bytes in the Fifo buffer
    /// </summary>
    public int Count
    {
        get 
        { 
            lock (mi_FifoData)
            {
                return mi_FifoData.Count; 
            }
        }
    }

    /// <summary>
    /// Clears the Fifo buffer
    /// </summary>
    public void Clear()
    {
        lock (mi_FifoData)
        {
            mi_FifoData.Clear();
        }
    }

    /// <summary>
    /// Append data to the end of the fifo
    /// </summary>
    public void Push(Byte[] u8_Data)
    {
        lock (mi_FifoData)
        {
            // Internally the .NET framework uses Array.Copy() which is extremely fast
            mi_FifoData.AddRange(u8_Data);
        }
    }

    /// <summary>
    /// Get data from the beginning of the fifo.
    /// returns null if s32_Count bytes are not yet available.
    /// </summary>
    public Byte[] Pop(int s32_Count)
    {
        lock (mi_FifoData)
        {
            if (mi_FifoData.Count < s32_Count)
                return null;

            // Internally the .NET framework uses Array.Copy() which is extremely fast
            Byte[] u8_PopData = new Byte[s32_Count];
            mi_FifoData.CopyTo(0, u8_PopData, 0, s32_Count);
            mi_FifoData.RemoveRange(0, s32_Count);
            return u8_PopData;
        }
    }

    /// <summary>
    /// Gets a byte without removing it from the Fifo buffer
    /// returns -1 if the index is invalid
    /// </summary>
    public int PeekAt(int s32_Index)
    {
        lock (mi_FifoData)
        {
            if (s32_Index < 0 || s32_Index >= mi_FifoData.Count)
                return -1;

            return mi_FifoData[s32_Index];
        }
    }
}

Buffering byte data in C#
