Question

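I recently wrote my first piece of multithreaded code, and I'd appreciate some comments.

It delivers video samples from a buffer that is filled in the background by a stream parser (outside the scope of this question). If the buffer is empty, it needs to wait until the buffer level becomes acceptable and then continue.

The code is for Silverlight 4, with some error checking removed: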

// External class requests samples - can happen multiple times concurrently
protected override void GetSampleAsync()
{
    Interlocked.Add(ref getVideoSampleRequestsOutstanding, 1);                
}

// Runs on a background thread
void DoVideoPumping()
{
    do
    {
        if (getVideoSampleRequestsOutstanding > 0)
        {
            PumpNextVideoSample();

            // Decrement the counter
            Interlocked.Add(ref getVideoSampleRequestsOutstanding, -1);
        }
        else Thread.Sleep(0);  

    } while (!this.StopAllBackgroundThreads);
}       

void PumpNextVideoSample()
{
    // If the video sample buffer is empty, tell stream parser to give us more samples 
    bool MyVidBufferIsEmpty = false;
    bool parserAtEndOfStream = false;
    ParseMoreSamplesIfMyVideoBufferIsLow(ref MyVidBufferIsEmpty, ref parserAtEndOfStream);

    if (parserAtEndOfStream)  // No more data, start running down buffers
        this.RunningDownVideoBuffer = true;
    else if (MyVidBufferIsEmpty)  
    {
        // Buffer is empty, wait for samples
        WaitingOnEmptyVideoBuffer = true;
        WaitOnEmptyVideoBuffer.WaitOne();
    }

    // Buffer is OK
    nextSample = DeQueueVideoSample(); // thread-safe, returns NULL if a problem

    // Send the sample to the external renderer
    ReportGetSampleCompleted(nextSample);

}

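The code seems to work well. However, I've been told that using Thread.Sleep(...) is 'evil': when no samples are being requested, my code loops needlessly and eats CPU time.

Can my code be optimized further? Since my class is designed for an environment where samples will be requested, does the potential "pointless looping" scenario outweigh the simplicity of the current design?

Many thanks.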

Answer 1

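This looks like the classic producer/consumer pattern. The usual way to solve it is with what is known as a blocking queue.

Version 4.0 of .NET introduced a set of efficient, well-designed concurrent collection classes for this kind of problem. I think BlockingCollection<T> will meet your current needs.

If you don't have access to .NET 4.0, many websites contain implementations of blocking queues. My personal standard reference is Joe Duffy's book, Concurrent Programming on Windows. A good place to start is Marc Gravell's blocking queue, presented here on Stack Overflow.

The first benefit of a blocking queue is that you stop using busy-wait loops, hacky calls to Sleep(), and so on. Using a blocking queue to avoid that kind of code is always a good idea.

However, I think a blocking queue brings a more important benefit. At the moment, the code that produces work items, the code that consumes them, and the code that manages the queue are all mixed together. If you use a blocking queue correctly, you end up with much better-factored code that keeps the components of the algorithm separate: the queue, the producer, and the consumer.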
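
To make the separation concrete, here is a minimal sketch of the producer/consumer split using BlockingCollection<T>. It assumes .NET 4.0 is available; the VideoSample type, the BlockingPumpSketch class, and the capacity of 8 are made up for illustration and are not from the question.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical stand-in for whatever the real sample type is.
public sealed class VideoSample
{
    public int Index;
}

public static class BlockingPumpSketch
{
    // Produces `count` samples on one task, consumes them on another,
    // and returns how many samples the consumer received.
    public static int Run(int count)
    {
        int delivered = 0;
        using (var buffer = new BlockingCollection<VideoSample>(boundedCapacity: 8))
        {
            // Producer: stands in for the stream parser.
            Task producer = Task.Factory.StartNew(() =>
            {
                for (int i = 0; i < count; i++)
                    buffer.Add(new VideoSample { Index = i }); // blocks while the buffer is full
                buffer.CompleteAdding(); // signals end of stream
            });

            // Consumer: stands in for the pump. It blocks, without spinning,
            // whenever the buffer is empty and ends once adding is complete.
            Task consumer = Task.Factory.StartNew(() =>
            {
                foreach (VideoSample sample in buffer.GetConsumingEnumerable())
                    delivered++; // a real pump would hand the sample to the renderer here
            });

            Task.WaitAll(producer, consumer);
        }
        return delivered;
    }

    public static void Main()
    {
        Console.WriteLine(Run(100));
    }
}
```

The queue, the producer, and the consumer each occupy their own few lines, and neither side contains a Sleep() call or a counter check.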

Answer 2

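You have one major problem: Thread.Sleep()

It has a granularity of ~20 ms, which is rather coarse for video. In addition, Sleep(0) has the problem that it can starve lower-priority threads.

A better approach is to wait on a WaitHandle, preferably one built into the queue.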
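
As a rough sketch of waiting on a WaitHandle instead of sleeping, the pump below blocks in WaitHandle.WaitAny on two events, one for "work is available" and one for "stop". The class and member names are hypothetical, and the Pumped counter stands in for the real call to PumpNextVideoSample().

```csharp
using System;
using System.Threading;

public sealed class WaitHandlePumpSketch
{
    readonly AutoResetEvent workAvailable = new AutoResetEvent(false);
    readonly ManualResetEvent stopRequested = new ManualResetEvent(false);
    int requestsOutstanding;
    int pumped;

    // Number of samples pumped so far (illustrative only).
    public int Pumped { get { return Thread.VolatileRead(ref pumped); } }

    public void RequestSample()
    {
        Interlocked.Increment(ref requestsOutstanding);
        workAvailable.Set(); // wake the pump thread
    }

    public void Stop()
    {
        stopRequested.Set();
    }

    // Runs on a background thread.
    public void Pump()
    {
        var handles = new WaitHandle[] { stopRequested, workAvailable };
        // WaitAny blocks without consuming CPU and returns the index of the
        // signaled handle; index 0 is the stop event.
        while (WaitHandle.WaitAny(handles) != 0)
        {
            while (TryTakeRequest())
                Interlocked.Increment(ref pumped); // stand-in for PumpNextVideoSample()
        }
    }

    // Atomically decrements the counter, but only if it is positive.
    bool TryTakeRequest()
    {
        while (true)
        {
            int current = Interlocked.CompareExchange(ref requestsOutstanding, 0, 0);
            if (current == 0) return false;
            if (Interlocked.CompareExchange(ref requestsOutstanding, current - 1, current) == current)
                return true;
        }
    }

    public static void Main()
    {
        var pump = new WaitHandlePumpSketch();
        var worker = new Thread(pump.Pump);
        worker.Start();
        for (int i = 0; i < 3; i++) pump.RequestSample();
        SpinWait.SpinUntil(() => pump.Pumped == 3, 2000);
        pump.Stop();
        worker.Join();
        Console.WriteLine(pump.Pumped);
    }
}
```

One design consequence of this sketch: if the stop event and the work event are both signaled, the stop event wins, so requests still outstanding at shutdown are dropped.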

Answer 3

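Blocking queue is a simple example of a blocking queue. The main point is that threads need to coordinate through signals, not by checking the value of a counter or the state of a data structure. Every such check costs resources (CPU), so you need signaling (Monitor.Wait and Monitor.Pulse).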
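
For illustration, a minimal blocking queue built on Monitor.Wait and Monitor.Pulse might look like the sketch below. It is not the implementation linked above; the SimpleBlockingQueue name and the Close/TryDequeue members are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public sealed class SimpleBlockingQueue<T>
{
    readonly Queue<T> items = new Queue<T>();
    readonly object sync = new object();
    bool closed;

    public void Enqueue(T item)
    {
        lock (sync)
        {
            items.Enqueue(item);
            Monitor.Pulse(sync); // signal one waiting consumer
        }
    }

    // Marks the queue as finished and wakes every waiting consumer.
    public void Close()
    {
        lock (sync)
        {
            closed = true;
            Monitor.PulseAll(sync);
        }
    }

    // Blocks until an item is available. Returns false once the queue
    // has been closed and fully drained.
    public bool TryDequeue(out T item)
    {
        lock (sync)
        {
            while (items.Count == 0)
            {
                if (closed)
                {
                    item = default(T);
                    return false;
                }
                Monitor.Wait(sync); // releases the lock and sleeps until pulsed
            }
            item = items.Dequeue();
            return true;
        }
    }

    public static void Main()
    {
        var queue = new SimpleBlockingQueue<int>();
        var consumer = new Thread(() =>
        {
            int value;
            while (queue.TryDequeue(out value))
                Console.WriteLine(value);
        });
        consumer.Start();
        for (int i = 0; i < 3; i++) queue.Enqueue(i);
        queue.Close();
        consumer.Join();
    }
}
```

The consumer never polls: it sleeps inside Monitor.Wait until a producer pulses the lock, which is the signal-based coordination described above.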

Answer 4

You can use an AutoResetEvent instead of a manual Thread.Sleep. Doing so is fairly simple:

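readonly AutoResetEvent e = new AutoResetEvent(false); // starts unsignaled
int requestsOutstanding;

void RequestSample()
{
    Interlocked.Increment(ref requestsOutstanding);
    e.Set(); //also set this when StopAllBackgroundThreads=true!
}

void Pump()
{
    while (!this.StopAllBackgroundThreads) {
        e.WaitOne(); // blocks without spinning until a request arrives
        int leftOver = Interlocked.Decrement(ref requestsOutstanding);
        // Drain every request that has accumulated so far
        while(leftOver >= 0) {
            PumpNextVideoSample();
            leftOver = Interlocked.Decrement(ref requestsOutstanding);
        }
        // Undo the final decrement that took the counter below zero
        Interlocked.Increment(ref requestsOutstanding);
    }
}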

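Note that implementing a semaphore may be even more attractive. Basically, the synchronization overhead will be almost nil in your scenario anyway, and a simpler programming model is worth it. With a semaphore, you'd have something like this: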

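SimpleSemaphore sem = new SimpleSemaphore();

void RequestSample()
{
    sem.Release();
}

void Pump()
{
    while (true) {
        sem.WaitOne(); // blocks until a request is released
        // When stopping, set the flag and call sem.Release() once to wake this thread
        if(this.StopAllBackgroundThreads) break;
        PumpNextVideoSample();
    }
}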

...I'd say the simplicity is worth it!

For example, a simple semaphore implementation:

public sealed class SimpleSemaphore
{
    readonly object sync = new object();
    public int val;

    public void WaitOne()
    {
        lock(sync) {
            while(true) {
                if(val > 0) {
                    val--;
                    return;
                }
                Monitor.Wait(sync);
            }
        }
    }

    public void Release()
    {
        lock(sync) {
            if(val==int.MaxValue)
                throw new Exception("Too many releases without waits.");
            val++;
            Monitor.Pulse(sync);
        }
    }
}

In a trivial benchmark, this simple implementation takes ~1.7 seconds, where Semaphore takes 7.5 and SemaphoreSlim takes 1.1; in other words, it is surprisingly reasonable.
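
If the built-in SemaphoreSlim is available on your platform (it is part of .NET 4.0), the same pump can be sketched with it instead of a hand-rolled semaphore. The class name and the pumped counter below are hypothetical, and the counter stands in for PumpNextVideoSample().

```csharp
using System;
using System.Threading;

public sealed class SemaphorePumpSketch
{
    readonly SemaphoreSlim sem = new SemaphoreSlim(0); // no requests outstanding at start
    volatile bool stopAllBackgroundThreads;
    int pumped;

    // Number of samples pumped so far (illustrative only).
    public int Pumped { get { return Thread.VolatileRead(ref pumped); } }

    public void RequestSample()
    {
        sem.Release(); // one release per requested sample
    }

    public void Stop()
    {
        stopAllBackgroundThreads = true;
        sem.Release(); // wake the pump so it can observe the flag
    }

    // Runs on a background thread.
    public void Pump()
    {
        while (true)
        {
            sem.Wait(); // blocks until a request (or the stop signal) arrives
            if (stopAllBackgroundThreads) break;
            Interlocked.Increment(ref pumped); // stand-in for PumpNextVideoSample()
        }
    }

    public static void Main()
    {
        var pump = new SemaphorePumpSketch();
        var worker = new Thread(pump.Pump);
        worker.Start();
        for (int i = 0; i < 3; i++) pump.RequestSample();
        SpinWait.SpinUntil(() => pump.Pumped == 3, 2000);
        pump.Stop();
        worker.Join();
        Console.WriteLine(pump.Pumped);
    }
}
```

The semaphore's count plays the role of the outstanding-request counter, so no separate Interlocked bookkeeping is needed for the requests themselves.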
