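Does a MemoryStream block like a file stream?

Time: 2009-09-25 06:33:41

Tags: c# stream memorystream

I'm using a library that requires me to provide an object that implements this interface: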

public interface IConsole {
    TextWriter StandardInput { get; }
    TextReader StandardOutput { get; }
    TextReader StandardError { get; }
}

The library then uses the object's readers like this:

IConsole console = new MyConsole();
int readBytes = console.StandardOutput.Read(buffer, 0, buffer.Length);

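Normally, the class implementing IConsole gets its StandardOutput stream from an external process. In that case, the console.StandardOutput.Read call blocks until some data is written to the StandardOutput stream.

What I'm trying to do is create a test IConsole implementation that uses MemoryStreams and echoes whatever appears on StandardInput back onto StandardOutput. I tried: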

MemoryStream echoOutStream = new MemoryStream();
StandardOutput = new StreamReader(echoOutStream);

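The problem is that console.StandardOutput.Read returns 0 instead of blocking until there is some data. Is there any way to make a MemoryStream block when no data is available, or is there a different in-memory stream I could use?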
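For reference, the behavior described in the question can be reproduced with a minimal sketch like the one below. The class and method names are hypothetical and only serve the illustration. It shows that a read on an empty MemoryStream returns 0 immediately, and that a read straight after a write also returns 0 because reads and writes share one Position.

```csharp
using System.IO;

// Hypothetical helper, for illustration only.
public static class MemoryStreamReadDemo
{
    // Reads through a StreamReader over an empty MemoryStream.
    // The call returns immediately instead of waiting for data.
    public static int ReadFromEmptyStream()
    {
        using (var stream = new MemoryStream())
        using (var reader = new StreamReader(stream))
        {
            var buffer = new char[16];
            return reader.Read(buffer, 0, buffer.Length);
        }
    }

    // Writing first does not help unless Position is rewound,
    // because the write leaves Position at the end of the data.
    public static int ReadAfterWriteWithoutRewind()
    {
        using (var stream = new MemoryStream())
        {
            stream.Write(new byte[] { 1, 2, 3 }, 0, 3);
            var buffer = new byte[16];
            return stream.Read(buffer, 0, buffer.Length);
        }
    }
}
```

Both methods return without blocking, which is why a plain MemoryStream cannot stand in for a process's output stream here.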

5 Answers:

Answer 0: (score: 11)

Inspired by your answer, here is my multi-threaded, multi-write version:

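public class EchoStream : MemoryStream
{
    // Signalled while at least one written buffer is waiting to be read.
    private readonly ManualResetEvent _DataReady = new ManualResetEvent(false);
    // Buffers handed to Write, in arrival order.
    private readonly ConcurrentQueue<byte[]> _Buffers = new ConcurrentQueue<byte[]>();

    public bool DataAvailable{get { return !_Buffers.IsEmpty; }}

    public override void Write(byte[] buffer, int offset, int count)
    {
        // Note: the whole array is queued by reference; offset and count are ignored,
        // so the caller must not reuse the buffer after writing.
        _Buffers.Enqueue(buffer);
        _DataReady.Set();
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        // Block until Write signals that data is available.
        _DataReady.WaitOne();

        byte[] lBuffer;

        if (!_Buffers.TryDequeue(out lBuffer))
        {
            // Another reader took the buffer first.
            _DataReady.Reset();
            return -1;
        }

        // Queue drained: make the next Read block again.
        if (!DataAvailable)
            _DataReady.Reset();

        // Note: copies to the start of the destination and assumes it is large enough;
        // offset and count are ignored here as well.
        Array.Copy(lBuffer, buffer, lBuffer.Length);
        return lBuffer.Length;
    }
}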

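With your version, you have to read the stream after each write; consecutive writes are not possible. My version buffers every written buffer in a ConcurrentQueue (it is fairly simple to change it to a plain Queue and lock around it).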
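The "plain Queue with a lock" variant mentioned above might look roughly like the following sketch. It is not the answerer's code: the class name LockedEchoStream is hypothetical, it derives from Stream rather than MemoryStream, and it uses Monitor.Wait/PulseAll in place of the ManualResetEvent. Unlike the version above, it copies the written range and keeps a partially read chunk for the next Read. It never signals end of stream.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;

// Hypothetical sketch: a blocking echo stream built on Queue<T> plus a lock.
public class LockedEchoStream : Stream
{
    private readonly object _gate = new object();
    private readonly Queue<byte[]> _chunks = new Queue<byte[]>();
    private byte[] _current;     // chunk currently being drained, or null
    private int _currentOffset;  // next unread index in _current

    public override bool CanRead => true;
    public override bool CanWrite => true;
    public override bool CanSeek => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        if (count <= 0) return;
        // Copy the requested range so the caller may reuse its buffer.
        var copy = new byte[count];
        Buffer.BlockCopy(buffer, offset, copy, 0, count);
        lock (_gate)
        {
            _chunks.Enqueue(copy);
            Monitor.PulseAll(_gate); // wake any reader waiting for data
        }
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        if (count == 0) return 0;
        lock (_gate)
        {
            // Block while there is neither a partial chunk nor a queued one.
            while (_current == null && _chunks.Count == 0)
                Monitor.Wait(_gate);

            if (_current == null)
            {
                _current = _chunks.Dequeue();
                _currentOffset = 0;
            }

            int n = Math.Min(count, _current.Length - _currentOffset);
            Buffer.BlockCopy(_current, _currentOffset, buffer, offset, n);
            _currentOffset += n;
            if (_currentOffset == _current.Length) _current = null;
            return n;
        }
    }

    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
}
```

Because the wait and the queue check happen under the same lock, a Write cannot slip in between "queue is empty" and "start waiting", which is the kind of window that separate event objects leave open.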

Answer 1: (score: 9)

In the end I found a simple way to do it by inheriting from MemoryStream and taking over the Read and Write methods.

public class EchoStream : MemoryStream {

    // Signalled by Write when a buffer is ready for Read.
    private ManualResetEvent m_dataReady = new ManualResetEvent(false);
    // Most recently written buffer (held by reference, not copied) and its range.
    private byte[] m_buffer;
    private int m_offset;
    private int m_count;

    public override void Write(byte[] buffer, int offset, int count) {
        // Only the latest write is kept; a second Write before the next Read overwrites it.
        m_buffer = buffer;
        m_offset = offset;
        m_count = count;
        m_dataReady.Set();
    }

    public override int Read(byte[] buffer, int offset, int count) {
        if (m_buffer == null) {
            // Block until the stream has some more data.
            m_dataReady.Reset();
            m_dataReady.WaitOne();
        }

        // Copy no more than the caller asked for; any remainder of the written data is discarded.
        int bytesToCopy = (count < m_count) ? count : m_count;
        Buffer.BlockCopy(m_buffer, m_offset, buffer, offset, bytesToCopy);
        m_buffer = null;
        return bytesToCopy;
    }
}

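Answer 2: (score: 3)

An anonymous pipe stream blocks like a file stream, and it should handle more edge cases than the sample code provided here.

Here is a unit test that demonstrates this behavior.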

var cts = new CancellationTokenSource();
using (var pipeServer = new AnonymousPipeServerStream(PipeDirection.Out))
using (var pipeStream = new AnonymousPipeClientStream(PipeDirection.In, pipeServer.ClientSafePipeHandle))
{
    var buffer = new byte[1024];
    var readTask = pipeStream.ReadAsync(buffer, 0, buffer.Length, cts.Token);
    Assert.IsFalse(readTask.IsCompleted, "Read already complete");

    // Cancelling does NOT unblock the read
    cts.Cancel();
    Assert.IsFalse(readTask.IsCanceled, "Read cancelled");

    // Only sending data does
    pipeServer.WriteByte(42);
    var bytesRead = await readTask;
    Assert.AreEqual(1, bytesRead);
}
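Applied to the question, the pipe approach could be wrapped in a test IConsole along the lines of the sketch below. The class name PipeEchoConsole is hypothetical, and the IConsole interface is repeated from the question only to keep the sketch self-contained. It assumes, as the test above does, that both pipe ends can live in the same process. StandardError is wired to TextReader.Null as a placeholder, so unlike StandardOutput it does not block.

```csharp
using System;
using System.IO;
using System.IO.Pipes;

// Interface as given in the question.
public interface IConsole
{
    TextWriter StandardInput { get; }
    TextReader StandardOutput { get; }
    TextReader StandardError { get; }
}

// Hypothetical test double: whatever is written to StandardInput
// can be read back from StandardOutput, and reads block until data arrives.
public sealed class PipeEchoConsole : IConsole, IDisposable
{
    private readonly AnonymousPipeServerStream _writeEnd;
    private readonly AnonymousPipeClientStream _readEnd;

    public TextWriter StandardInput { get; }
    public TextReader StandardOutput { get; }
    public TextReader StandardError { get; } = TextReader.Null; // placeholder, never blocks

    public PipeEchoConsole()
    {
        _writeEnd = new AnonymousPipeServerStream(PipeDirection.Out);
        _readEnd = new AnonymousPipeClientStream(PipeDirection.In, _writeEnd.ClientSafePipeHandle);

        // AutoFlush pushes each write into the pipe right away,
        // so a waiting reader sees it without an explicit Flush.
        StandardInput = new StreamWriter(_writeEnd) { AutoFlush = true };
        StandardOutput = new StreamReader(_readEnd);
    }

    public void Dispose()
    {
        // Close the write end first so a pending read can finish.
        StandardInput.Dispose();
        StandardOutput.Dispose();
    }
}
```

A test would write a line to StandardInput and then call StandardOutput.ReadLine(), which waits until the line has been written.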

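Answer 3: (score: 1)

I'm going to add one more refined version of EchoStream. It is a combination of the other two versions, plus some suggestions from the comments.

UPDATE - I have tested this EchoStream with over 50 terabytes of data run through it for more than 50 days straight. The test had it sitting between a network stream and a ZStandard compression stream. The async paths have also been tested, which brought a rare hanging condition to the surface. It appears that the built-in System.IO.Stream does not expect ReadAsync and WriteAsync to be called on the same stream at the same time, which can cause it to hang when no data is available, because both calls use the same internal variables. I therefore had to replace those functions, which resolved the hang.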

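This version has the following enhancements:

1) It was written from scratch using the System.IO.Stream base class instead of MemoryStream.

2) The constructor can set a maximum queue depth. If that level is reached, stream writes block until a Read brings the queue depth back below the maximum (no limit = 0, default = 10).

3) When reading or writing data, the buffer offset and count are now honored. You can also call Read with a smaller buffer than the one used for Write without throwing an exception or losing data. BlockCopy is used in a loop to fill in the bytes until count is satisfied.

4) There is a public property called CopyBufferOnWrite, which makes Write copy the buffer. Setting it to true makes it safe to reuse the byte buffer after calling Write.

5) There are public properties called ReadTimeout and WriteTimeout, which control how long Read and Write block before giving up (default = Infinite, -1). On timeout, Read returns 0 and Write throws a TimeoutException.

6) The BlockingCollection<> class is used, which combines a ConcurrentQueue with built-in blocking and signalling (the role AutoResetEvent played in my earlier version). Originally I used ConcurrentQueue and AutoResetEvent directly, but there is a rare condition where data that has been enqueued is not immediately available when AutoResetEvent lets a thread through in Read(). This happened about once for every 500 GB of data passed through. The cure was to sleep and check for the data again. Sometimes Sleep(0) worked, but in extreme cases with high CPU usage it could take as much as Sleep(1000) before the data showed up. BlockingCollection<> has a lot of extra code that handles this cleanly, and the problem went away after I switched to it.

7) It has been tested to be thread safe for simultaneous async reads and writes.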

using System;
using System.IO;
using System.Threading.Tasks;
using System.Threading;
using System.Collections.Concurrent;

public class EchoStream : Stream
{
    public override bool CanTimeout { get; } = true;
    public override int ReadTimeout { get; set; } = Timeout.Infinite;
    public override int WriteTimeout { get; set; } = Timeout.Infinite;
    public override bool CanRead { get; } = true;
    public override bool CanSeek { get; } = false;
    public override bool CanWrite { get; } = true;

    public bool CopyBufferOnWrite { get; set; } = false;

    private readonly object _lock = new object();

    // Default underlying mechanism for BlockingCollection is ConcurrentQueue<T>, which is what we want
    private readonly BlockingCollection<byte[]> _Buffers;
    private int _maxQueueDepth = 10;

    private byte[] m_buffer = null;
    private int m_offset = 0;
    private int m_count = 0;

    private bool m_Closed = false;
    public override void Close()
    {
        m_Closed = true;

        // release any waiting writes
        _Buffers.CompleteAdding();
    }

    public bool DataAvailable
    {
        get
        {
            return _Buffers.Count > 0;
        }
    }

    private long _Length = 0L;
    public override long Length
    {
        get
        {
            return _Length;
        }
    }

    private long _Position = 0L;
    public override long Position
    {
        get
        {
            return _Position;
        }
        set
        {
            throw new NotImplementedException();
        }
    }

    public EchoStream() : this(10)
    {
    }

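    public EchoStream(int maxQueueDepth)
    {
        _maxQueueDepth = maxQueueDepth;
        // BlockingCollection rejects a bounded capacity of 0, so treat 0 (or less) as "no limit".
        _Buffers = maxQueueDepth > 0
            ? new BlockingCollection<byte[]>(maxQueueDepth)
            : new BlockingCollection<byte[]>();
    }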

    // These hide (via "new") the base class xxxxAsync functions, because the default base class shares state between ReadAsync and WriteAsync, which causes a hang if both are called at once.
    // Note: they are only used when called through an EchoStream-typed reference; a call through a Stream-typed reference still reaches the base implementation.
    public new Task WriteAsync(byte[] buffer, int offset, int count)
    {
        return Task.Run(() => Write(buffer, offset, count));
    }

    // See the note on WriteAsync above.
    public new Task<int> ReadAsync(byte[] buffer, int offset, int count)
    {
        return Task.Run(() => Read(buffer, offset, count));
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        if (m_Closed || buffer.Length - offset < count || count <= 0)
            return;

        byte[] newBuffer;
        if (!CopyBufferOnWrite && offset == 0 && count == buffer.Length)
            newBuffer = buffer;
        else
        {
            newBuffer = new byte[count];
            System.Buffer.BlockCopy(buffer, offset, newBuffer, 0, count);
        }
        if (!_Buffers.TryAdd(newBuffer, WriteTimeout))
            throw new TimeoutException("EchoStream Write() Timeout");

        _Length += count;
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        if (count == 0)
            return 0;
        lock (_lock)
        {
            if (m_count == 0 && _Buffers.Count == 0)
            {
                if (m_Closed)
                    return -1;

                if (_Buffers.TryTake(out m_buffer, ReadTimeout))
                {
                    m_offset = 0;
                    m_count = m_buffer.Length;
                }
                else
                    return m_Closed ? -1 : 0;
            }

            int returnBytes = 0;
            while (count > 0)
            {
                if (m_count == 0)
                {
                    if (_Buffers.TryTake(out m_buffer, 0))
                    {
                        m_offset = 0;
                        m_count = m_buffer.Length;
                    }
                    else
                        break;
                }

                var bytesToCopy = (count < m_count) ? count : m_count;
                System.Buffer.BlockCopy(m_buffer, m_offset, buffer, offset, bytesToCopy);
                m_offset += bytesToCopy;
                m_count -= bytesToCopy;
                offset += bytesToCopy;
                count -= bytesToCopy;

                returnBytes += bytesToCopy;
            }

            _Position += returnBytes;

            return returnBytes;
        }
    }

    public override int ReadByte()
    {
        byte[] returnValue = new byte[1];
        return (Read(returnValue, 0, 1) <= 0 ? -1 : (int)returnValue[0]);
    }

    public override void Flush()
    {
    }

    public override long Seek(long offset, SeekOrigin origin)
    {
        throw new NotImplementedException();
    }

    public override void SetLength(long value)
    {
        throw new NotImplementedException();
    }
}
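The queue-depth and timeout behavior in points 2 and 5 comes from BlockingCollection's bounded capacity and its timed TryAdd/TryTake calls. The sketch below isolates just that mechanism. The class name BoundedQueueDemo is hypothetical, and the capacity of 1 and the 50 ms timeouts are arbitrary values chosen for illustration.

```csharp
using System.Collections.Concurrent;

// Hypothetical demo of the bounded, timed queue operations used by the stream above.
public static class BoundedQueueDemo
{
    // Returns { secondAddSucceeded, takeSucceeded, takeOnEmptySucceeded }.
    public static bool[] Run()
    {
        // Capacity 1 stands in for the stream's maximum queue depth.
        using (var queue = new BlockingCollection<byte[]>(1))
        {
            queue.Add(new byte[] { 1 });

            // The queue is full: TryAdd waits up to 50 ms, then gives up.
            // The stream's Write turns this case into a TimeoutException.
            bool secondAdd = queue.TryAdd(new byte[] { 2 }, 50);

            // One item is waiting, so this succeeds without blocking.
            bool take = queue.TryTake(out byte[] item, 50);

            // The queue is empty: TryTake waits up to 50 ms, then gives up.
            // The stream's Read turns this case into a return value of 0.
            bool takeOnEmpty = queue.TryTake(out item, 50);

            return new[] { secondAdd, take, takeOnEmpty };
        }
    }
}
```

Passing Timeout.Infinite (-1) as the timeout makes both calls wait indefinitely, which matches the stream's default ReadTimeout and WriteTimeout.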

Answer 4: (score: 0)

Here is my take on the EchoStream posted above. It handles the offset and count parameters for both Write and Read.

public class EchoStream : MemoryStream
{
    private readonly ManualResetEvent _DataReady = new ManualResetEvent(false);
    private readonly ConcurrentQueue<byte[]> _Buffers = new ConcurrentQueue<byte[]>();

    public bool DataAvailable { get { return !_Buffers.IsEmpty; } }

    public override void Write(byte[] buffer, int offset, int count)
    {
        // Queue a copy of just the requested range (Skip/Take/ToArray need "using System.Linq;").
        _Buffers.Enqueue(buffer.Skip(offset).Take(count).ToArray());
        _DataReady.Set();
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        _DataReady.WaitOne();

        byte[] lBuffer;

        if (!_Buffers.TryDequeue(out lBuffer))
        {
            _DataReady.Reset();
            return -1;
        }

        if (!DataAvailable)
            _DataReady.Reset();

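        // Report the number of bytes actually copied, not the chunk size.
        // Bytes beyond count in this chunk are dropped.
        int bytesToCopy = Math.Min(lBuffer.Length, count);
        Array.Copy(lBuffer, 0, buffer, offset, bytesToCopy);
        return bytesToCopy;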
    }
}

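I was able to use this class to unit test a System.IO.Pipelines implementation. I needed a MemoryStream that could simulate multiple read calls in succession without reaching the end of the stream.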