Converting an expensive call to async while keeping the event system intact

Time: 2018-12-15 01:50:15

Tags: c# async-await

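I've run into a problem while trying to optimize some old code. The general picture is this: there is an "export engine" that spins up some "writer" objects depending on the output needed. The writer fires up a DataReader object and subscribes to its events so that it can process the data being read. It then starts a long-running "GetData" method in the reader. This retrieves data from an old database, which takes a long time (!). The data reader processes the returned values and fires several events, which let the writer process the data.

A very simplified pseudocode example of the DataReader is provided below.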

class DataReader
{
    // delegates
    internal delegate void DataRowReadHandler(object sender, DataRowReadArgs e);
    internal delegate void DataProgressChangedHandler(object sender, DataProgressChangedArgs e);
    internal delegate void DataReadCompleteHandler(object sender, DataReadCompleteArgs e);
    // events
    internal event DataProgressChangedHandler DataProgressChanged;
    internal event DataReadCompleteHandler DataReadCompleted;
    internal event DataRowReadHandler DataRowRead;

    // this method chomps on and on and raises an event whenever the database read returns something
    internal void GetData()
    {
        for (int totalrows = 0; totalrows < _cursor.RowCount; totalrows += _maxrows)
        {
            // I want to keep GetRawData running while the data it fetched is being processed
            string[][] rawdata = _cursor.GetRawData(_maxrows);

            // -- a ton of post-processing I want to do while database is being read--

            // and then report progress
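            foreach (var row in rawdata)
            {
                // pseudocode: row.Index stands for the index of the current row
                DataRowReadArgs rowArgs = new DataRowReadArgs(row.Index);
                OnDataRowRead(rowArgs); // raise event after each row
            }
            // pseudocode: batch and counter are tracked elsewhere in the real class
            DataProgressChangedArgs progressArgs = new DataProgressChangedArgs(batch, counter);
            OnDataProgressChanged(progressArgs); // raise event after each batch of rows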
        }
        // report we're done
        DataReadCompleteArgs e = new DataReadCompleteArgs(counter);
        OnDataReadCompleted(e); // done with reading data
    }

    protected virtual void OnDataProgressChanged(DataProgressChangedArgs e)
    {
        DataProgressChangedHandler handler = DataProgressChanged;
        if (handler != null)
            handler(this, e);
    }

    protected virtual void OnDataReadCompleted(DataReadCompleteArgs e)
    {
        DataReadCompleteHandler handler = DataReadCompleted;
        if (handler != null)
            handler(this, e);
    }

    protected virtual void OnDataRowRead(DataRowReadArgs e)
    {
        DataRowReadHandler handler = DataRowRead;
        if (handler != null)
            handler(this, e);
    }
}

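What I want is this: keep the database read (which will be the slowest part) running, and process the returned data as the query results become available. That is: post-process the data in the reader, fire the events, and have the handlers in the writers process them while the database read continues. Ideally I would also like some kind of cancellation token to stop the read if something goes wrong, but first things first. I don't want to touch the event-based system, which many classes rely on; I only want the database read to run in parallel and have the rest of the code respond as soon as there is a result.

I have spent the better part of a week trying things with await/async, TaskCompletionSource and whatnot, but I still can't seem to wrap my head around it. I have come close: I managed to build a list of tasks and feed it to an intermediate method that processes each task as it completes, and then await that.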

internal async Task GetDataAsync()
{
    IList<Task<string[][]>> tasks = CreateCursorReadTasks();
    var processingTasks = tasks.Select(AwaitAndProcessAsync).ToList();
    await Task.WhenAll(processingTasks);
    // this isn't 'awaited' in the sense I expected
    // also, what order are they performed in? The database is single-threaded, no queues, nothing
    // I need to fire my 'done' event only after all tasks have finished
}

private IList<Task<string[][]>> CreateCursorReadTasks()
{
    IList<Task<string[][]>> retval = new List<Task<string[][]>>();
    for (int totalrows = 0; totalrows < this._cursor.RowCount; totalrows += _maxrows)
    {
        retval.Add(Task.Run(() => _cursor.GetRawData(_maxrows)));
    }
    return retval;
}

internal async Task AwaitAndProcessAsync(Task<string[][]> task)
{
    string[][] rawdata = await task;
    // Do all the post-processing and fire the events like in the GetData method of DataReader
}

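Besides the fact that all of this seems overly complicated, I ran into two problems: a) my event handlers all seem to be null, even though I subscribed to them; b) I don't know where or how to raise the completed event.

My question: looking at the GetData method in my DataReader class, how would you suggest I make the very expensive database call run asynchronously?

2 answers:

Answer 0: (score: 0)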

Your pseudocode seems to be fine. I tried to verify it with a program in which I simulated the call to the database with Task.Delay(5000) and allowed only one task to access it at any time (to take into account the fact that the database is single-threaded).

The output shows that it runs correctly. In the example, 4 tasks are started; each database access takes 5 seconds and the post-processing takes 3 seconds. The total run time is about 23 seconds (4 * 5 + 3), which means that the post-processing runs in parallel with the database reads. The events also fire as expected. The order in which the tasks execute is different on every run. See the program and its output below:

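using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

class Program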
{
    public static async Task Main(string[] args)
    {
        var dataReader = new DataReader();
        dataReader.DataProgressChanged += (s, e) => Log.D($"*** Event - Processed {e.TaskId}");
        dataReader.DataReadCompleted += (s, e) => Log.D("*** Event - Data read complete");

        await dataReader.GetDataAsync();

        Console.ReadKey();
    }
}

public class DataReader
{
    internal delegate void DataProgressChangedHandler(object sender, DataProgressChangedArgs e);
    internal delegate void DataReadCompleteHandler(object sender, DataReadCompleteArgs e);

    internal event DataProgressChangedHandler DataProgressChanged;
    internal event DataReadCompleteHandler DataReadCompleted;

    private SemaphoreSlim semaphore = new SemaphoreSlim(1);

    internal async Task GetDataAsync()
    {
        Log.D("Start");
        var tasks = CreateCursorReadTasks();
        var processingTasks = tasks.Select(AwaitAndProcessAsync).ToList();
        await Task.WhenAll(processingTasks);
        OnDataReadCompleted(new DataReadCompleteArgs());
    }

    private IList<ReadTaskWrapper> CreateCursorReadTasks()
    {
        var retval = new List<ReadTaskWrapper>();
        for (int totalrows = 0; totalrows < 4; totalrows++)
        {
            int taskId = totalrows;
            retval.Add(new ReadTaskWrapper
            {
                Task = Task.Run(async () => { return await SimulateDbReadAsync(taskId); }),
                Id = taskId
            });
        }
        return retval;
    }

    private async Task<string[][]> SimulateDbReadAsync(int taskId)
    {
        await semaphore.WaitAsync();
        Log.D($"Starting data read task {taskId}");
        await Task.Delay(5000);
        Log.D($"Finished data read task {taskId}");
        semaphore.Release();
        return new string[1][];
    }

    internal async Task AwaitAndProcessAsync(ReadTaskWrapper task)
    {
        string[][] rawdata = await task.Task;
        Log.D($"Start postprocessing of task {task.Id}");
        await Task.Delay(3000);
        Log.D($"Finished prostprocessing of task {task.Id}");

        OnDataProgressChanged(new DataProgressChangedArgs { TaskId = task.Id });
    }

    internal void OnDataProgressChanged(DataProgressChangedArgs args)
    {
        DataProgressChanged?.Invoke(this, args);
    }

    internal void OnDataReadCompleted(DataReadCompleteArgs args)
    {
        DataReadCompleted?.Invoke(this, args);
    }

    internal class DataProgressChangedArgs : EventArgs
    {
        public int TaskId { get; set; }
    }

    internal class DataReadCompleteArgs : EventArgs
    {
    }
}

public class ReadTaskWrapper
{
    public int Id { get; set; }
    public Task<string[][]> Task { get; set; }
}

public static class Log
{
    public static void D(string msg)
    {
        Console.WriteLine($"{DateTime.Now.ToLongTimeString()}: {msg}");
    }
}

Program output:

15:54:20: Start
15:54:20: Starting data read task 2
15:54:25: Finished data read task 2
15:54:25: Starting data read task 0
15:54:25: Start postprocessing of task 2
15:54:28: Finished prostprocessing of task 2
15:54:28: *** Event - Processed 2
15:54:30: Finished data read task 0
15:54:30: Starting data read task 3
15:54:30: Start postprocessing of task 0
15:54:33: Finished prostprocessing of task 0
15:54:33: *** Event - Processed 0
15:54:35: Finished data read task 3
15:54:35: Start postprocessing of task 3
15:54:35: Starting data read task 1
15:54:38: Finished prostprocessing of task 3
15:54:38: *** Event - Processed 3
15:54:40: Finished data read task 1
15:54:40: Start postprocessing of task 1
15:54:43: Finished prostprocessing of task 1
15:54:43: *** Event - Processed 1
15:54:43: *** Event - Data read complete

For further investigation: how do you instantiate the DataReader class in your program, and how do you subscribe to the events? Could you describe in more detail what you mean by the comment "this isn't 'awaited' in the sense I expected"?
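
On the ordering question: in the program above all read tasks are started up front, so the order in which they reach the database is not fixed. If the batches have to be read in order, one possible alternative is a read-ahead loop that keeps exactly one read in flight and starts the next read just before post-processing the current batch. The following is only a minimal sketch under assumptions: ReadAheadDemo, RunAsync and ReadBatch are hypothetical names, and ReadBatch merely stands in for _cursor.GetRawData with a short sleep.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ReadAheadDemo
{
    // Hypothetical stand-in for _cursor.GetRawData: one slow read of a single batch.
    private static string[][] ReadBatch(int batchIndex)
    {
        Thread.Sleep(200); // simulate a slow database call
        return new[] { new[] { $"batch {batchIndex}" } };
    }

    // Reads batches strictly one at a time and in order, but starts the read of
    // batch i + 1 before batch i is post-processed, so the two overlap.
    public static async Task<List<string>> RunAsync(int batchCount, Action<string> onProcessed)
    {
        var processed = new List<string>();
        if (batchCount <= 0)
            return processed;

        Task<string[][]> pending = Task.Run(() => ReadBatch(0));
        for (int i = 0; i < batchCount; i++)
        {
            string[][] rawdata = await pending;

            // Kick off the next read before touching the current batch.
            if (i + 1 < batchCount)
            {
                int next = i + 1; // copy for the closure
                pending = Task.Run(() => ReadBatch(next));
            }

            // Post-processing and event raising would go here.
            foreach (string[] row in rawdata)
            {
                processed.Add(row[0]);
                onProcessed?.Invoke(row[0]);
            }
        }
        return processed;
    }

    public static async Task Main()
    {
        List<string> result = await RunAsync(4, msg => Console.WriteLine($"Processed {msg}"));
        Console.WriteLine($"Done, {result.Count} batches");
    }
}
```

Because the next read only starts after the previous one has been awaited, no semaphore is needed here, and a "done" event can simply be raised after the loop.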

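Answer 1: (score: 0)

Let's take advantage of modern facilities: the producer/consumer pattern via pipelines and the BlockingCollection class.

Inside your GetData method, start two tasks: one to fetch the data and a second to process it.

In the first task, you can still use the event system. At the same time, add the data to the collection, which does not take long.

In the second task, take the data from the collection and process it. Waiting on the GetConsumingEnumerable method is very efficient.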

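using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

class DataReader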
{
    public CancellationTokenSource CTS { get; } = new CancellationTokenSource();

    internal void GetData()
    {
        // DataRowReadArgs matches what is added below; use whatever data type you need
        var values = new BlockingCollection<DataRowReadArgs>();

        var readTask = Task.Factory.StartNew(() =>
        {
            try
            {
                // here your code
                for (...)
                {
                    if (CTS.Token.IsCancellationRequested)
                        break;

                    foreach (var row in rawdata)
                    {
                        DataRowReadArgs args = new DataRowReadArgs(row.Index);
                        //...
                        values.Add(args); // put value to blocking collection
                    }
                }
            }
            catch (Exception e) { /* process possible exception */}
            finally { values.CompleteAdding(); }

        }, TaskCreationOptions.LongRunning);

        var processTask = Task.Factory.StartNew(() =>
        {
            foreach (var value in values.GetConsumingEnumerable())
            {
                if (CTS.Token.IsCancellationRequested)
                    break;

                // process value
            }
        }, TaskCreationOptions.LongRunning);

        Task.WaitAll(readTask, processTask);            
    }
}

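You can cancel the tasks at any time. Note that GetData blocks until both tasks finish, so Cancel has to be called from another thread while GetData is running:

var dataReader = new DataReader();
dataReader.GetData();    // blocks until both tasks have finished
dataReader.CTS.Cancel(); // call this from another thread while GetData is running

Instead of Task.WaitAll you can use await Task.WhenAll(readTask, processTask);
In that case the method signature should be: async Task GetDataAsync()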
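
For illustration, the async variant might look roughly like the sketch below. It is a hedged example and not the asker's real class: DataReaderSketch, the rowCount parameter, the simplified DataRowReadArgs, and the use of EventHandler<T> instead of the custom delegates are all assumptions, and Thread.Sleep stands in for the slow database read.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Simplified stand-in for the asker's event args type (assumption).
public class DataRowReadArgs : EventArgs
{
    public DataRowReadArgs(int index) { Index = index; }
    public int Index { get; }
}

public class DataReaderSketch
{
    public CancellationTokenSource CTS { get; } = new CancellationTokenSource();

    public event EventHandler<DataRowReadArgs> DataRowRead;
    public event EventHandler DataReadCompleted;

    public async Task GetDataAsync(int rowCount)
    {
        var values = new BlockingCollection<DataRowReadArgs>();

        // Producer: reads rows and hands them to the collection.
        Task readTask = Task.Factory.StartNew(() =>
        {
            try
            {
                for (int i = 0; i < rowCount && !CTS.Token.IsCancellationRequested; i++)
                {
                    Thread.Sleep(50); // stands in for the slow database read
                    values.Add(new DataRowReadArgs(i));
                }
            }
            finally { values.CompleteAdding(); } // lets the consumer loop end
        }, TaskCreationOptions.LongRunning);

        // Consumer: processes rows as they arrive and raises the row event.
        Task processTask = Task.Factory.StartNew(() =>
        {
            foreach (DataRowReadArgs args in values.GetConsumingEnumerable())
            {
                if (CTS.Token.IsCancellationRequested)
                    break;
                DataRowRead?.Invoke(this, args);
            }
        }, TaskCreationOptions.LongRunning);

        // Does not block the caller's thread, unlike Task.WaitAll.
        await Task.WhenAll(readTask, processTask);
        DataReadCompleted?.Invoke(this, EventArgs.Empty);
    }
}

public static class Program
{
    public static async Task Main()
    {
        var reader = new DataReaderSketch();
        reader.DataRowRead += (s, e) => Console.WriteLine($"Row {e.Index}");
        reader.DataReadCompleted += (s, e) => Console.WriteLine("Done");
        await reader.GetDataAsync(5);
    }
}
```

In this sketch the row event is raised on the consumer task's thread, not on the caller's thread, and the completed event is raised after both tasks have finished, including when the read was cancelled. Handlers that touch UI or other thread-affine state would need to marshal back themselves.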