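How to implement a continuous producer-consumer pattern inside a Windows service

Date: 2017-03-21 04:21:33

Tags: c# .net asynchronous windows-services tpl-dataflow

Here is what I'm trying to do:

  • Keep a queue in memory of the items that need to be processed (i.e. IsProcessed = 0)
  • Every 5 seconds, get the unprocessed items from the database and add them to the queue if they are not already in it
  • Continuously pull items from the queue, process them, and update each item in the database once it has been processed (IsProcessed = 1)
  • Do all of this "as parallel as possible"

My service constructor looks like this: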

public MyService()
{
    Ticker.Elapsed += FillQueue;
}

I start that timer when the service starts:

protected override void OnStart(string[] args)
{
    Ticker.Enabled = true;
    Task.Run(() => { ConsumeWork(); });
}

My FillQueue looks like this:

private static async void FillQueue(object source, ElapsedEventArgs e)   
{
    var items = GetUnprocessedItemsFromDb();
    foreach(var item in items)
    {
        if(!Work.Contains(item))
        {
            Work.Enqueue(item);
        }   
    }
}

And my ConsumeWork looks like this:

private static void ConsumeWork()
{
    while(true)
    {
        if(Work.Count > 0)
        {
            var item = Work.Peek();
            Process(item);
            Work.Dequeue();
        }
        else
        {
            Thread.Sleep(500);
        }
    }
}

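However, this is probably a naive implementation, and I'm wondering whether .NET has any kind of class that is exactly what I need for this type of situation.

2 Answers:

Answer 0: (score: 3)

While @JSteward's answer is a good start, you can improve it by mixing the TPL-Dataflow and Rx.NET extensions: a dataflow block can easily become an observer of your data, and with the Rx Timer it takes much less effort on your part (Rx.Timer explanation).

We can adapt the MSDN article to your needs, like this: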

private const int EventIntervalInSeconds = 5;
private const int DueIntervalInSeconds = 60;

var source =
    // sequence of Int64 numbers, starting from 0
    // https://msdn.microsoft.com/en-us/library/hh229435.aspx
    Observable.Timer(
        // fire first event after 1 minute waiting
        TimeSpan.FromSeconds(DueIntervalInSeconds),
        // fire all next events each 5 seconds
        TimeSpan.FromSeconds(EventIntervalInSeconds))
    // each number will have a timestamp
    .Timestamp()
    // each time we select some items to process
    .SelectMany(GetItemsFromDB)
    // filter already added
    .Where(i => !_processedItemIds.Contains(i.Id));

var action = new ActionBlock<Item>(ProcessItem, new ExecutionDataflowBlockOptions
    {
        // we can start as many item processing as processor count
        MaxDegreeOfParallelism = Environment.ProcessorCount,
    });

IDisposable subscription = source.Subscribe(action.AsObserver());

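Also, your check for whether an item has already been handled is not quite accurate: an item can be selected from the db as unprocessed at the very moment you have finished processing it but have not yet updated it in the database. In that case the item is removed from the Queue<T> and then added back by the producer, which is why this solution keeps a thread-safe cache of the ids currently being processed (a plain HashSet<T> is not thread-safe):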

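private static async Task ProcessItem(Item item)
{
    // Add returns false when the id is already cached, so the check and
    // the insert happen as one atomic step
    if (!_processedItemIds.Add(item.Id))
    {
        return;
    }

    // actual work here

    // save item as processed in database

    // we need to wait to ensure item not to appear in queue again
    // (note: this delay keeps one MaxDegreeOfParallelism slot busy)
    await Task.Delay(TimeSpan.FromSeconds(EventIntervalInSeconds * 2));

    // clear the processed cache to reduce memory usage
    _processedItemIds.Remove(item.Id);
}

public class Item
{
    public Guid Id { get; set; }
}

// temporary cache for items in process
// ConcurrentBag<T> has no Remove method, so this sketch uses a small set
// built on ConcurrentDictionary (needs System.Collections.Concurrent);
// ConcurrentGuidSet is a hypothetical helper name
private static readonly ConcurrentGuidSet _processedItemIds = new ConcurrentGuidSet();

private sealed class ConcurrentGuidSet
{
    private readonly ConcurrentDictionary<Guid, byte> _ids = new ConcurrentDictionary<Guid, byte>();

    public bool Contains(Guid id) => _ids.ContainsKey(id);
    public bool Add(Guid id) => _ids.TryAdd(id, 0);
    public bool Remove(Guid id) => _ids.TryRemove(id, out _);
}

private static IEnumerable<Item> GetItemsFromDB(Timestamped<long> time)
{
    // log event timing
    Console.WriteLine($"Event # {time.Value} at {time.Timestamp}");

    // return items from DB (placeholder: one new item per timer tick)
    return new[] { new Item { Id = Guid.NewGuid() } };
}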

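You can implement the cache cleanup in another way, for example by starting a "GC" timer that removes processed items from the cache on a regular basis.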
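A minimal sketch of such a timer-based cleanup, under stated assumptions: `ProcessedItemCache`, its `retention` and `sweepInterval` parameters, and the `Sweep` method are hypothetical names that do not appear in the answer, and ids are stamped with the time they were first seen so that a periodic sweep can evict the old ones.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical helper (not part of the answer's code): remembers item ids
// and lets a background timer evict entries older than a retention period.
public sealed class ProcessedItemCache : IDisposable
{
    private readonly ConcurrentDictionary<Guid, DateTime> _entries =
        new ConcurrentDictionary<Guid, DateTime>();
    private readonly TimeSpan _retention;
    private readonly Timer _sweeper;

    public ProcessedItemCache(TimeSpan retention, TimeSpan sweepInterval)
    {
        _retention = retention;
        // System.Threading.Timer invokes the callback on a thread-pool thread
        _sweeper = new Timer(_ => Sweep(DateTime.UtcNow), null, sweepInterval, sweepInterval);
    }

    // Returns false when the id is already cached (atomic check-and-add)
    public bool TryAdd(Guid id) => _entries.TryAdd(id, DateTime.UtcNow);

    public bool Contains(Guid id) => _entries.ContainsKey(id);

    // Removes entries added at least `retention` before nowUtc and returns
    // how many were removed; enumerating a ConcurrentDictionary while
    // removing from it is safe
    public int Sweep(DateTime nowUtc)
    {
        var removed = 0;
        foreach (var entry in _entries)
        {
            if (nowUtc - entry.Value >= _retention && _entries.TryRemove(entry.Key, out _))
            {
                removed++;
            }
        }
        return removed;
    }

    public void Dispose() => _sweeper.Dispose();
}
```

With a cache like this, ProcessItem would no longer need the Task.Delay before removing the id: it could return as soon as TryAdd reports a duplicate and otherwise leave the eviction to the timer. A retention of at least two polling intervals would mirror the delay used in the answer.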

To stop the events and the item processing, you should Dispose the subscription and, possibly, Complete the ActionBlock:

subscription.Dispose();
action.Complete();
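Note that Complete only tells the block to stop accepting new items; items already queued are still processed. If the service needs to wait for that queue to drain, the block's Completion task can be awaited. The following is a minimal sketch with a stand-in workload; `ShutdownSketch` and its contents are illustrative names, not part of the answer.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class ShutdownSketch
{
    // Posts 20 items, signals completion, waits for the queue to drain,
    // and returns the number of items that were actually processed.
    public static async Task<int> RunAsync()
    {
        var results = new ConcurrentBag<int>();

        var action = new ActionBlock<int>(async n =>
        {
            await Task.Delay(10); // stand-in for real work
            results.Add(n);
        }, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 4 });

        foreach (var n in Enumerable.Range(0, 20))
        {
            action.Post(n); // an unbounded block accepts every item
        }

        action.Complete();       // stop accepting new items
        await action.Completion; // finishes once all queued items are processed

        return results.Count;
    }
}
```

OnStop in a Windows service is synchronous, so there one would probably block with a timeout instead, for example `action.Completion.Wait(TimeSpan.FromSeconds(30))`; the 30 seconds is an arbitrary illustrative value.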

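You can find more information about Rx.Net in their guidelines on github.

Answer 1: (score: 0)

You can use an ActionBlock to do your processing; it has a built-in queue that you can post work to. You can read up on tpl-dataflow here: Intro to TPL-Dataflow and Introduction to Dataflow, Part 1. Finally, here is a quick sample to get you going. I've left out a lot, but it should at least get you started.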

using System;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

namespace MyWorkProcessor {

    public class WorkProcessor {

        public WorkProcessor() {
            // create the token source first: CreatePipeline reads cts.Token
            cts = new CancellationTokenSource();
            Processor = CreatePipeline();
        }

        public async Task StartProcessing() {
            try {
                await Task.Run(() => GetWorkFromDatabase());
            } catch (OperationCanceledException) {
                //handle cancel
            }
        }

        // requests cancellation of polling and processing, e.g. from OnStop
        public void StopProcessing() {
            cts.Cancel();
        }

        private CancellationTokenSource cts {
            get;
        }

        private ITargetBlock<WorkItem> Processor {
            get;
        }

        private TimeSpan DatabasePollingFrequency {
            get;
        } = TimeSpan.FromSeconds(5);

        private ITargetBlock<WorkItem> CreatePipeline() {
            var options = new ExecutionDataflowBlockOptions() {
                BoundedCapacity = 100,
                CancellationToken = cts.Token
            };
            return new ActionBlock<WorkItem>(item => ProcessWork(item), options);
        }

        private async Task GetWorkFromDatabase() {
            while (!cts.IsCancellationRequested) {
                var work = await GetWork();
                // SendAsync waits while the block is at its BoundedCapacity
                await Processor.SendAsync(work, cts.Token);
                await Task.Delay(DatabasePollingFrequency, cts.Token);
            }
        }

        private async Task<WorkItem> GetWork() {
            // Context and WorkItem stand for your own data-access layer and
            // item type; neither is defined in this sample
            return await Context.GetWork();
        }

        private void ProcessWork(WorkItem item) {
            //do processing
        }
    }
}
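The sample sets BoundedCapacity and feeds the block with SendAsync rather than Post. The difference matters once the block is full: Post returns false immediately when the block declines an item, whereas SendAsync returns a task that completes once the block has room. A small sketch of that behaviour, using a capacity of 1 and a hypothetical `gate` task to hold the first item in processing (`BackPressureSketch` is an illustrative name):

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class BackPressureSketch
{
    // Returns { firstPost, secondPost, sendWaited, sendAccepted }
    public static async Task<bool[]> RunAsync()
    {
        // The gate keeps the first item "in processing" until released
        var gate = new TaskCompletionSource<bool>();

        var block = new ActionBlock<int>(
            async _ => { await gate.Task; },
            new ExecutionDataflowBlockOptions { BoundedCapacity = 1 });

        bool firstPost = block.Post(1);   // accepted: the block was empty
        bool secondPost = block.Post(2);  // declined: capacity of 1 is in use

        // SendAsync does not fail; the block postpones the offer instead
        Task<bool> pendingSend = block.SendAsync(3);
        bool sendWaited = !pendingSend.IsCompleted;

        gate.SetResult(true);                  // let the first item finish
        bool sendAccepted = await pendingSend; // completes once there is room

        block.Complete();
        await block.Completion;

        return new[] { firstPost, secondPost, sendWaited, sendAccepted };
    }
}
```

For a polling producer this means that awaiting SendAsync slows the database loop down to the speed of the consumers instead of silently dropping items.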