Is this lock-free .NET queue thread-safe?

Date: 2009-10-09 02:16:08

Tags: c# .net algorithm multithreading data-structures

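My question is: is the class included below, a single-reader single-writer queue, thread-safe? This kind of queue is called lock-free, even though it will block if the queue fills up. The data structure was inspired by Marc Gravell's implementation of a blocking queue here on Stack Overflow.

The point of the structure is to allow a single thread to write data to the buffer while another thread reads it. All of this needs to happen as quickly as possible.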

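A similar data structure is described in an article at DDJ by Herb Sutter, except that the implementation is in C++. Another difference is that, rather than a vanilla linked list, I use a linked list of arrays.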

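Rather than including just a snippet, I am including the whole thing under a permissive open source license (MIT License 1.0), in case anyone finds it useful and wants to use it (as-is or modified).

This is related to other Stack Overflow questions on how to create a blocking concurrent queue (see "Creating a blocking Queue in .NET" and "Thread-safe blocking queue implementation in .NET").

Here is the code: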

using System;
using System.Collections.Generic;
using System.Threading;
using System.Diagnostics;

namespace CollectionSandbox
{
    /// This is a single reader / single writer buffered queue implemented
    /// with (almost) no locks. This implementation will block only if filled 
    /// up. The implementation is a linked-list of arrays.
    /// It was inspired by the desire to create a non-blocking version 
    /// of the blocking queue implementation in C# by Marc Gravell
    /// https://stackoverflow.com/questions/530211/creating-a-blocking-queuet-in-net/530228#530228
    class SimpleSharedQueue<T> : IStreamBuffer<T>
    {
        /// Used to signal things are no longer full
        ManualResetEvent canWrite = new ManualResetEvent(true);

        /// This is the size of a buffer 
        const int BUFFER_SIZE = 512;

        /// This is the maximum number of nodes. 
        const int MAX_NODE_COUNT = 100;

        /// This marks the location to write new data to.
        Cursor adder;

        /// This marks the location to read new data from.
        Cursor remover;

        /// Indicates that no more data is going to be written to the node.
        public bool completed = false;

        /// A node is an array of data items, a pointer to the next node,
        /// and a count of the occupied items 
        class Node
        {
            /// Where the data is stored.
            public T[] data = new T[BUFFER_SIZE];

            /// The next node in the list, or null if this is the last node.
            public Node next;

            /// The number of data items currently stored in the node.
            public int count;

            /// Default constructor, only used for first node.
            public Node()
            {
                count = 0;
            }

            /// Only ever called by the writer to add new Nodes to the scene
            public Node(T x, Node prev)
            {
                data[0] = x;
                count = 1;

                // The previous node has to be safely updated to point to this node.
                // A reader could be looking at the pointer while we set it, so this
                // should be atomic.
                Interlocked.Exchange(ref prev.next, this);
            }
        }

        /// This is used to point to a location within a single node, and can perform 
        /// reads or writes. One cursor will only ever read, and another cursor will only
        /// ever write.
        class Cursor
        {
            /// Points to the parent Queue
            public SimpleSharedQueue<T> q;

            /// The current node
            public Node node;

            /// For a writer, this points to the position that the next item will be written to.
            /// For a reader, this points to the position that the next item will be read from.
            public int current = 0;

            /// Creates a new cursor, pointing to the node
            public Cursor(SimpleSharedQueue<T> q, Node node)
            {
                this.q = q;
                this.node = node;
            }

            /// Used to push more data onto the queue
            public void Write(T x)
            {
                Trace.Assert(current == node.count);

                // Check whether we are at the node limit, and are going to need to allocate a new buffer.
                if (current == BUFFER_SIZE)
                {
                    // Check if the queue is full
                    if (q.IsFull())
                    {
                        // Signal the canWrite event to false
                        q.canWrite.Reset();

                        // Wait until the canWrite event is signaled 
                        q.canWrite.WaitOne();
                    }

                    // create a new node
                    node = new Node(x, node);
                    current = 1;
                }
                else
                {
                    // If the implementation is correct then the reader will never try to access this 
                    // array location while we set it. This is because of the invariant that 
                    // if reader and writer are at the same node: 
                    //    reader.current < node.count 
                    // and 
                    //    writer.current = node.count 
                    node.data[current++] = x;

                    // We have to use Interlocked to ensure that we increment the count 
                    // atomically, because the reader could be reading it.
                    Interlocked.Increment(ref node.count);
                }
            }

            /// Pulls data from the queue. Returns false only if there is no
            /// more data and the writer has marked the queue as completed.
            public bool Read(ref T x)
            {
                while (true)
                {
                    if (current < node.count)
                    {
                        x = node.data[current++];
                        return true;
                    }
                    else if ((current == BUFFER_SIZE) && (node.next != null))
                    {
                        // Move the current node to the next one.
                        // We know it is safe to do so.
                        // The old node will have no more references to it 
                        // and will be collected by the garbage collector.
                        node = node.next;

                        // If there is a writer thread waiting on the Queue,
                        // then release it.
                        // Conceptually there is a "if (q.IsFull)", but we can't place it 
                        // because that would lead to a Race condition.
                        q.canWrite.Set();

                        // point to the first spot                
                        current = 0;

                        // One of the invariants is that every node created after the first,
                        // will have at least one item. So the following call is safe
                        x = node.data[current++];
                        return true;
                    }

                    // If we get here, we have read the most recently added data.
                    // We then check to see if the writer has finished producing data.
                    if (q.completed)
                        return false;

                    // If we get here there is no data waiting, and the completed flag is not set.
                    // Wait a millisecond. The system will also context switch. 
                    // This will allow the writing thread some additional resources to pump out 
                    // more data (especially if it itself is multithreaded)
                    Thread.Sleep(1);
                }
            }
        }

        /// Returns the number of nodes currently used.
        private int NodeCount
        {
            get
            {
                int result = 0;
                Node cur = null;
                Interlocked.Exchange<Node>(ref cur, remover.node);

                // Counts all nodes from the remover to the adder
                // Not efficient, but this is not called often. 
                while (cur != null)
                {
                    ++result;
                    Interlocked.Exchange<Node>(ref cur, cur.next);
                }
                return result;
            }
        }

        /// Construct the queue.
        public SimpleSharedQueue()
        {
            Node root = new Node();
            adder = new Cursor(this, root);
            remover = new Cursor(this, root);
        }

        /// Indicate to the reader that no more data is going to be written.
        public void MarkCompleted()
        {
            completed = true;
        }

        /// Read the next piece of data. Returns false if there is no more data. 
        public bool Read(ref T x)
        {
            return remover.Read(ref x);
        }

        /// Writes more data.
        public void Write(T x)
        {
            adder.Write(x);
        }

        /// Tells us if there are too many nodes, and can't add anymore.
        private bool IsFull()
        {
            return NodeCount == MAX_NODE_COUNT;  
        }
    }
}

6 Answers:

Answer 0 (score: 7)

Microsoft Research CHESS should prove to be a good tool for testing your implementation.

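Answer 1 (score: 4)

The presence of Sleep() makes a lock-free approach useless. The only reason to take on the complexity of a lock-free design is the need for absolute speed and the desire to avoid the cost of semaphores. Using Sleep(1) defeats that purpose entirely.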
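
As an illustration of the point about Sleep(1), here is a minimal sketch of one alternative for the reader's wait loop: spinning with `SpinWait` instead of sleeping for a fixed millisecond. The class `SpinWaitDemo`, the `published` counter, and `Run` are hypothetical names made up for this example, and `Volatile.Read` assumes .NET 4.5 or later. Note that `SpinWait.SpinOnce` itself starts yielding (and occasionally sleeping) if the wait drags on, so this reduces latency for short waits rather than removing waiting altogether.

```csharp
using System;
using System.Threading;

static class SpinWaitDemo
{
    // Hypothetical shared counter: how many items the writer has made visible.
    static int published;

    // Reader side: wait until at least `wanted` items have been published.
    static void WaitForItems(int wanted)
    {
        var spinner = new SpinWait();
        while (Volatile.Read(ref published) < wanted)
            spinner.SpinOnce(); // busy-spins first, backs off to yielding if the wait is long
    }

    // Starts a writer thread, waits for all items, returns the final count.
    public static int Run(int itemCount)
    {
        published = 0;
        var writer = new Thread(() =>
        {
            for (int i = 0; i < itemCount; i++)
                Interlocked.Increment(ref published);
        });
        writer.Start();
        WaitForItems(itemCount);
        writer.Join();
        return Volatile.Read(ref published);
    }
}
```

Calling `SpinWaitDemo.Run(1000)` waits for the writer without a fixed one-millisecond pause on every empty poll. If the reader can be idle for long periods, a blocking primitive such as an event may still be the better choice.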

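Answer 2 (score: 3)

Given that I can't find any reference stating that Interlocked.Exchange blocks reads or writes, I would say no. I would also question why you want to go lock-free, since it seldom brings enough benefit to offset its complexity.

Microsoft gave an excellent presentation on this at GDC 2009, and you can get the slides here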

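Answer 3 (score: 2)

Beware of the double-checked, single-lock pattern (as in the link quoted above: http://www.yoda.arachsys.com/csharp/singleton.html).

Quoting verbatim from "Modern C++ Design" by Andrei Alexandrescu:

    Very experienced multithreaded programmers know that even the Double-Checked Locking pattern, although correct on paper, is not always correct in practice. In certain symmetric multiprocessor environments (the ones featuring the so-called relaxed memory model), the writes are committed to the main memory in bursts, rather than one by one. The bursts occur in increasing order of addresses, not in chronological order. Due to this rearranging of writes, the memory as seen by one processor at a time might look as if the operations are not performed in the correct order by another processor. Concretely, the assignment to pInstance_ performed by a processor might occur before the Singleton object has been fully initialized! Thus, sadly, the Double-Checked Locking pattern is known to be defective for such systems.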
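
For readers who have not seen the pattern the quote refers to, here is a minimal C# sketch of double-checked locking. `Settings`, `SettingsHolder`, and the value 42 are hypothetical names and data invented for this example. The `volatile` modifier on the field is what is commonly relied on in .NET to keep a reader from seeing the reference before the object's fields; `Lazy<T>` (assumes .NET 4 or later) is shown alongside as a simpler way to get the same effect.

```csharp
using System;
using System.Threading;

// Hypothetical type standing in for an expensive-to-create singleton.
sealed class Settings
{
    public readonly int Value = 42;
}

static class SettingsHolder
{
    static readonly object Gate = new object();

    // volatile: the reference is published with release semantics and
    // read with acquire semantics, so readers see a fully built object.
    static volatile Settings instance;

    public static Settings Instance
    {
        get
        {
            if (instance == null)            // first check, no lock taken
            {
                lock (Gate)
                {
                    if (instance == null)    // second check, under the lock
                        instance = new Settings();
                }
            }
            return instance;
        }
    }

    // Alternative: let the framework handle thread-safe lazy creation.
    static readonly Lazy<Settings> lazy = new Lazy<Settings>(() => new Settings());

    public static Settings LazyInstance
    {
        get { return lazy.Value; }
    }
}
```

Without `volatile` (or an equivalent barrier), the first unlocked check is exactly where the reordering described in the quote could bite.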

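Answer 4 (score: 1)

I suspect it is not thread-safe. Imagine the following scenario:

Two threads enter cursor.Write. The first gets as far as the line node = new Node(x, node); in the true branch of the if (current == BUFFER_SIZE) statement (so let's assume current == BUFFER_SIZE at that moment). Once that branch finishes and current has been set to 1, any thread arriving afterwards would take the other path through the if statement. Now imagine that thread 1 loses its time slice before it gets that far, and thread 2 gets to run and also enters the true branch, on the mistaken belief that the condition still holds. It should have taken the other path.

I haven't run this code, so I'm not sure whether my assumptions can actually occur here, but if they can (i.e. cursor.Write is entered from multiple threads while current == BUFFER_SIZE), then it may well be prone to concurrency errors.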
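
The scenario above only arises if more than one thread calls Write. One way to catch that during testing is a small guard that throws when a second caller enters while the first is still inside. This is a sketch only: `SingleWriterGuard`, `Enter`, and `Exit` are hypothetical names that do not appear in the question's code, and `Volatile.Write` assumes .NET 4.5 or later.

```csharp
using System;
using System.Threading;

// Hypothetical debugging aid: detects overlapping calls to a method
// that is only supposed to be used by one thread at a time.
sealed class SingleWriterGuard
{
    int busy; // 0 = nobody inside, 1 = a caller is inside

    public void Enter()
    {
        // Atomically flip 0 -> 1; any other prior value means someone is already inside.
        if (Interlocked.CompareExchange(ref busy, 1, 0) != 0)
            throw new InvalidOperationException(
                "Write was entered by a second caller while another call was in progress.");
    }

    public void Exit()
    {
        Volatile.Write(ref busy, 0);
    }
}
```

A writer method would call `Enter()` at the top and `Exit()` in a `finally` block. The guard does not make the method safe for multiple writers; it only turns a silent race into a visible exception.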

Answer 5 (score: 1)

First, I wonder about the assumption behind these two sequential lines of code:

                node.data[current++] = x;

                // We have to use Interlocked to ensure that we increment the count 
                // atomically, because the reader could be reading it.
                Interlocked.Increment(ref node.count);

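What guarantees that the new value of node.data[] has been committed to that memory location? It is not stored at a volatile memory address, so, if I understand correctly, it can be cached. Couldn't that lead to a 'dirty' read? The same may be true in other places, but this one stood out at a glance.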
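
To make the ordering concern concrete, here is a minimal sketch of the "store the item, then publish the count" idea using explicit release/acquire operations. `PublishedSlots`, `Append`, `TryRead`, and `Demo` are hypothetical names, the capacity of 512 simply mirrors BUFFER_SIZE from the question, and `Volatile.Read`/`Volatile.Write` assume .NET 4.5 or later. As far as I know, Interlocked operations in .NET also act as full memory barriers, so the original Interlocked.Increment may already provide this ordering; the sketch just spells the requirement out.

```csharp
using System;
using System.Threading;

// Single writer, single reader. Not bounds-checked: a sketch only.
sealed class PublishedSlots
{
    readonly int[] data = new int[512];
    int count; // number of slots the reader is allowed to look at

    // Writer thread only.
    public void Append(int x)
    {
        data[count] = x;                       // 1. plain store of the item
        Volatile.Write(ref count, count + 1);  // 2. release: step 1 cannot move after this
    }

    // Reader thread only.
    public bool TryRead(int index, out int x)
    {
        if (index < Volatile.Read(ref count))  // acquire: the read below cannot move before this
        {
            x = data[index];
            return true;
        }
        x = 0;
        return false;
    }

    // Writes 0..499 on one thread, sums them on the calling thread.
    public static long Demo()
    {
        var slots = new PublishedSlots();
        var writer = new Thread(() =>
        {
            for (int i = 0; i < 500; i++)
                slots.Append(i);
        });
        writer.Start();

        long sum = 0;
        var spinner = new SpinWait();
        for (int i = 0; i < 500; )
        {
            int x;
            if (slots.TryRead(i, out x)) { sum += x; i++; }
            else spinner.SpinOnce();
        }
        writer.Join();
        return sum;
    }
}
```

The reader only touches `data[index]` after it has observed a `count` greater than `index`, which is the invariant the question's comments describe.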

Second, multi-threaded code that contains the following:

Thread.Sleep(int);

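...is never a good sign. If it is required, the code is destined to fail; if it is not required, it is a waste. I really wish they would remove this API entirely. Realize that it is a request to wait at least that amount of time. With the overhead of context switching, you are almost certainly going to wait longer.

Third, I completely fail to understand the use of the Interlocked API here. Maybe I'm tired and just missing the point, but I can't find the potential conflict where both threads read and write the same variable. It seems the only use I could find for an interlocked exchange would be to modify the contents of node.data[] to fix #1 above.

Lastly, the implementation seems somewhat over-complicated. Am I missing the point of the whole Cursor/Node thing, or is it basically doing the same thing as this class? (Note: I haven't tried it, and I don't think it is thread-safe either; I'm just trying to boil down what I think you are doing.)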

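using System.Threading;

class ReaderWriterQueue<T>
{
    readonly AutoResetEvent _readComplete;
    readonly T[] _buffer;
    readonly int _maxBuffer;
    // volatile so each thread observes the other's position updates in order
    volatile int _readerPos, _writerPos;

    public ReaderWriterQueue(int maxBuffer)
    {
        _readComplete = new AutoResetEvent(true);
        _maxBuffer = maxBuffer;
        _buffer = new T[_maxBuffer];
        _readerPos = _writerPos = 0;
    }

    // Advances a position by one slot, wrapping around at the end of the buffer
    public int Next(int current) { return ++current == _maxBuffer ? 0 : current; }

    public bool Read(ref T item)
    {
        if (_readerPos != _writerPos)
        {
            item = _buffer[_readerPos];
            _readerPos = Next(_readerPos);
            // Wake the writer in case it is waiting for a free slot
            _readComplete.Set();
            return true;
        }
        else
            return false;
    }

    public void Write(T item)
    {
        int next = Next(_writerPos);

        // The buffer is full when advancing would collide with the reader
        while (next == _readerPos)
            _readComplete.WaitOne();

        // Store into the current write slot, then publish it by advancing _writerPos
        _buffer[_writerPos] = item;
        _writerPos = next;
    }
}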

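So am I completely off base here, failing to see the magic in the original class?

I must confess one thing: I despise threading. I've seen the best developers fail at it. This article gives a good example of what it takes to get threading right: http://www.yoda.arachsys.com/csharp/singleton.html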