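I'm having some trouble optimising some old code. The overall picture is this: there is an "export engine" that spins up some "writer" objects depending on the required output. A writer fires up a DataReader object and subscribes to its events so that it can process the data being read. It then kicks off the long-running "GetData" method in the reader. This retrieves data from an old database, which takes a long time (!). The data reader processes the returned values and fires several events, which allow the writers to process the data.
A very simplified pseudocode sample of the DataReader is provided below.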
class DataReader
{
    // delegates
    internal delegate void DataRowReadHandler(object sender, DataRowReadArgs e);
    internal delegate void DataProgressChangedHandler(object sender, DataProgressChangedArgs e);
    internal delegate void DataReadCompleteHandler(object sender, DataReadCompleteArgs e);

    // events
    internal event DataProgressChangedHandler DataProgressChanged;
    internal event DataReadCompleteHandler DataReadCompleted;
    internal event DataRowReadHandler DataRowRead;

    // this method chomps on and on and raises an event when the database read returns something
    internal void GetData()
    {
        int batch = 0;   // number of batches read so far
        int counter = 0; // number of rows processed so far
        for (int totalrows = 0; totalrows < _cursor.RowCount; totalrows += _maxrows)
        {
            // I want to keep GetRawData running while the data it fetched is being processed
            string[][] rawdata = _cursor.GetRawData(_maxrows);
            // -- a ton of post-processing I want to do while database is being read --
            // and then report progress
            foreach (string[] row in rawdata)
            {
                // the running row count stands in for the row index here
                DataRowReadArgs rowArgs = new DataRowReadArgs(counter++);
                OnDataRowRead(rowArgs); // raise event after each row
            }
            batch++;
            DataProgressChangedArgs progressArgs = new DataProgressChangedArgs(batch, counter);
            OnDataProgressChanged(progressArgs); // raise event after each batch of rows
        }
        // report we're done
        DataReadCompleteArgs e = new DataReadCompleteArgs(counter);
        OnDataReadCompleted(e); // done with reading data
    }

    protected virtual void OnDataProgressChanged(DataProgressChangedArgs e)
    {
        DataProgressChangedHandler handler = DataProgressChanged;
        if (handler != null)
            handler(this, e);
    }

    protected virtual void OnDataReadCompleted(DataReadCompleteArgs e)
    {
        DataReadCompleteHandler handler = DataReadCompleted;
        if (handler != null)
            handler(this, e);
    }

    protected virtual void OnDataRowRead(DataRowReadArgs e)
    {
        DataRowReadHandler handler = DataRowRead;
        if (handler != null)
            handler(this, e);
    }
}
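What I want is this: keep the database read (which will be the slowest part) running, and process the returned data as soon as query results become available. That is: post-process the data in the reader, fire the events, and let the handlers in the writers process them while the database read continues. Ideally I'd also like some kind of cancellation token to stop the read when something goes wrong, but first things first. I don't want to touch the event-based system that many classes depend on; I just want the database read to run in parallel and let the rest of the code respond whenever there are results.
I've spent nearly a week trying things with await/async, TaskCompletionSource and other stuff, but I still can't get my head around it. I've come close: I managed to build a list of tasks and feed it to an intermediate method that processes each task as it completes, and then await the lot.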
internal async Task GetDataAsync()
{
    IList<Task<string[][]>> tasks = CreateCursorReadTasks();
    var processingTasks = tasks.Select(AwaitAndProcessAsync).ToList();
    await Task.WhenAll(processingTasks);
    // this isn't 'awaited' in the sense I expected
    // also, what order are they performed in? The database is single-threaded, no queues, nothing
    // I need to fire my 'done' event only after all tasks have finished
}

private IList<Task<string[][]>> CreateCursorReadTasks()
{
    IList<Task<string[][]>> retval = new List<Task<string[][]>>();
    for (int totalrows = 0; totalrows < this._cursor.RowCount; totalrows += _maxrows)
    {
        retval.Add(Task.Run(() => _cursor.GetRawData(_maxrows)));
    }
    return retval;
}

internal async Task AwaitAndProcessAsync(Task<string[][]> task)
{
    string[][] rawdata = await task;
    // Do all the post-processing and fire the events like in the GetData method of DataReader
}
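Apart from all this seeming overly complicated, I've run into two problems: a) my event handlers all appear to be null, even though I have subscribed to them; b) I don't know where or how to raise the completed event.
My question: looking at the GetData method in my DataReader class, how would you suggest I make the very expensive database call run asynchronously?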
Answer 0 (score: 0)
Your pseudocode looks fine. I tried to verify it with a program in which I simulated the database call with Task.Delay(5000) and allowed only one task at a time to access it (to account for the fact that the database is single-threaded).
The output shows that it runs correctly. In the example, 4 tasks are started; each database access takes 5 seconds and the post-processing takes 3 seconds (both simulated, obviously). The total run time is about 23 seconds (4 * 5 + 3), which means the post-processing runs in parallel with the database reads. The events also fire as expected. The order in which the tasks execute differs from run to run. The test program is listed first, followed by its output:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    public static async Task Main(string[] args)
    {
        var dataReader = new DataReader();
        dataReader.DataProgressChanged += (s, e) => Log.D($"*** Event - Processed {e.TaskId}");
        dataReader.DataReadCompleted += (s, e) => Log.D("*** Event - Data read complete");
        await dataReader.GetDataAsync();
        Console.ReadKey();
    }
}

public class DataReader
{
    internal delegate void DataProgressChangedHandler(object sender, DataProgressChangedArgs e);
    internal delegate void DataReadCompleteHandler(object sender, DataReadCompleteArgs e);

    internal event DataProgressChangedHandler DataProgressChanged;
    internal event DataReadCompleteHandler DataReadCompleted;

    // Only one task at a time may "access the database".
    private readonly SemaphoreSlim semaphore = new SemaphoreSlim(1);

    internal async Task GetDataAsync()
    {
        Log.D("Start");
        var tasks = CreateCursorReadTasks();
        var processingTasks = tasks.Select(AwaitAndProcessAsync).ToList();
        await Task.WhenAll(processingTasks);
        // All reads and all post-processing are done here, so raise the completed event.
        OnDataReadCompleted(new DataReadCompleteArgs());
    }

    private IList<ReadTaskWrapper> CreateCursorReadTasks()
    {
        var retval = new List<ReadTaskWrapper>();
        for (int totalrows = 0; totalrows < 4; totalrows++)
        {
            int taskId = totalrows; // copy for the lambda
            retval.Add(new ReadTaskWrapper
            {
                Task = Task.Run(() => SimulateDbReadAsync(taskId)),
                Id = taskId
            });
        }
        return retval;
    }

    private async Task<string[][]> SimulateDbReadAsync(int taskId)
    {
        await semaphore.WaitAsync();
        try
        {
            Log.D($"Starting data read task {taskId}");
            await Task.Delay(5000); // simulated database read
            Log.D($"Finished data read task {taskId}");
        }
        finally
        {
            semaphore.Release(); // release even if the read fails
        }
        return new string[1][];
    }

    internal async Task AwaitAndProcessAsync(ReadTaskWrapper task)
    {
        string[][] rawdata = await task.Task;
        Log.D($"Start postprocessing of task {task.Id}");
        await Task.Delay(3000); // simulated post-processing
        Log.D($"Finished prostprocessing of task {task.Id}");
        OnDataProgressChanged(new DataProgressChangedArgs { TaskId = task.Id });
    }

    internal void OnDataProgressChanged(DataProgressChangedArgs args)
    {
        DataProgressChanged?.Invoke(this, args);
    }

    internal void OnDataReadCompleted(DataReadCompleteArgs args)
    {
        DataReadCompleted?.Invoke(this, args);
    }

    internal class DataProgressChangedArgs : EventArgs
    {
        public int TaskId { get; set; }
    }

    internal class DataReadCompleteArgs : EventArgs
    {
    }
}

public class ReadTaskWrapper
{
    public int Id { get; set; }
    public Task<string[][]> Task { get; set; }
}

public static class Log
{
    public static void D(string msg)
    {
        Console.WriteLine($"{DateTime.Now.ToLongTimeString()}: {msg}");
    }
}
Program output:
15:54:20: Start
15:54:20: Starting data read task 2
15:54:25: Finished data read task 2
15:54:25: Starting data read task 0
15:54:25: Start postprocessing of task 2
15:54:28: Finished prostprocessing of task 2
15:54:28: *** Event - Processed 2
15:54:30: Finished data read task 0
15:54:30: Starting data read task 3
15:54:30: Start postprocessing of task 0
15:54:33: Finished prostprocessing of task 0
15:54:33: *** Event - Processed 0
15:54:35: Finished data read task 3
15:54:35: Start postprocessing of task 3
15:54:35: Starting data read task 1
15:54:38: Finished prostprocessing of task 3
15:54:38: *** Event - Processed 3
15:54:40: Finished data read task 1
15:54:40: Start postprocessing of task 1
15:54:43: Finished prostprocessing of task 1
15:54:43: *** Event - Processed 1
15:54:43: *** Event - Data read complete
For further investigation: how do you instantiate the DataReader class in your program, and how do you subscribe to the events? Could you also describe in more detail what you mean by the comment that this isn't 'awaited' in the sense you expected?
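Since the tasks are all started up front with Task.Run, the batches are read in a different order on every run. If batch order matters (for example with a forward-only cursor), one possible alternative is to read sequentially and start the next read just before processing the current batch. The following is only a sketch under assumptions: PrefetchingReader, readBatch and processBatch are hypothetical names, with readBatch standing in for _cursor.GetRawData and processBatch for the post-processing and event raising.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper, not part of the original code.
public static class PrefetchingReader
{
    // readBatch: the slow, blocking read (stand-in for _cursor.GetRawData).
    // processBatch: post-processing plus event raising for one batch.
    // Returns the number of batches processed.
    public static async Task<int> ReadAllAsync(
        int batchCount,
        Func<int, string[][]> readBatch,
        Action<int, string[][]> processBatch)
    {
        if (batchCount <= 0)
            return 0;

        // Start the first read on a thread-pool thread.
        Task<string[][]> pending = Task.Run(() => readBatch(0));

        for (int i = 0; i < batchCount; i++)
        {
            // Wait for batch i; at most one read is in flight at any time.
            string[][] current = await pending;

            // Kick off the next read before processing the current batch,
            // so the read and the processing overlap.
            if (i + 1 < batchCount)
            {
                int next = i + 1; // copy for the lambda
                pending = Task.Run(() => readBatch(next));
            }

            processBatch(i, current);
        }
        return batchCount;
    }
}
```

With this shape the batches are processed strictly in order, and the "done" event could be raised right after the awaited ReadAllAsync call returns.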
Answer 1 (score: 0)
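Let's take advantage of modern features: the producer/consumer pattern (a pipeline) built on the BlockingCollection class.
Inside your GetData method, start two tasks: one fetches the data, the second processes it.
You can still use the event system. In the first task, simply add the data to the collection as well, which doesn't take long.
In the second task, take the data from the collection and process it. The GetConsumingEnumerable method waits for new items very efficiently.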
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

class DataReader
{
    public CancellationTokenSource CTS { get; } = new CancellationTokenSource();

    internal void GetData()
    {
        // Use whatever data type you need as the item type
        var values = new BlockingCollection<DataRowReadArgs>();

        // Producer: reads from the database and queues the rows
        var readTask = Task.Factory.StartNew(() =>
        {
            try
            {
                // here your code
                for (...)
                {
                    if (CTS.Token.IsCancellationRequested)
                        break;
                    foreach (var row in rawdata)
                    {
                        DataRowReadArgs args = new DataRowReadArgs(row.Index);
                        //...
                        values.Add(args); // put value to blocking collection
                    }
                }
            }
            catch (Exception e) { /* process possible exception */ }
            finally { values.CompleteAdding(); } // lets the consumer loop finish
        }, TaskCreationOptions.LongRunning);

        // Consumer: blocks until items are available, ends after CompleteAdding
        var processTask = Task.Factory.StartNew(() =>
        {
            foreach (var value in values.GetConsumingEnumerable())
            {
                if (CTS.Token.IsCancellationRequested)
                    break;
                // process value
            }
        }, TaskCreationOptions.LongRunning);

        Task.WaitAll(readTask, processTask);
    }
}
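You can cancel the tasks at any time. Note that GetData as written blocks until both tasks have finished, so it has to run on another thread for the Cancel call to have any effect:
var dataReader = new DataReader();
var running = Task.Run(() => dataReader.GetData()); // GetData blocks, so run it in the background
dataReader.CTS.Cancel();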
Instead of Task.WaitAll you can use await Task.WhenAll(readTask, processTask);
In that case the method signature should be: async Task GetDataAsync()
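To make the asynchronous variant concrete, here is a minimal self-contained sketch of the same producer/consumer idea with await Task.WhenAll. It is an illustration under assumptions, not the original reader: PipelineReader, RowProcessed and Completed are hypothetical names, and the batches parameter stands in for the slow database read.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch; class and member names are illustrative only.
public class PipelineReader
{
    public CancellationTokenSource CTS { get; } = new CancellationTokenSource();

    // Raised on the consumer task for every processed row.
    public event EventHandler<string> RowProcessed;
    // Raised once, after both tasks have finished (also after a cancellation).
    public event EventHandler Completed;

    public async Task GetDataAsync(IEnumerable<string[]> batches)
    {
        var rows = new BlockingCollection<string>();
        CancellationToken token = CTS.Token;

        // Producer: enumerating 'batches' stands in for the slow database read.
        Task readTask = Task.Factory.StartNew(() =>
        {
            try
            {
                foreach (string[] batch in batches)
                {
                    if (token.IsCancellationRequested)
                        break;
                    foreach (string row in batch)
                        rows.Add(row);
                }
            }
            finally { rows.CompleteAdding(); } // lets the consumer loop end
        }, TaskCreationOptions.LongRunning);

        // Consumer: blocks while the queue is empty, ends after CompleteAdding.
        Task processTask = Task.Factory.StartNew(() =>
        {
            foreach (string row in rows.GetConsumingEnumerable())
            {
                if (token.IsCancellationRequested)
                    break;
                RowProcessed?.Invoke(this, row);
            }
        }, TaskCreationOptions.LongRunning);

        // Does not block the caller's thread while both tasks run.
        await Task.WhenAll(readTask, processTask);
        rows.Dispose();
        Completed?.Invoke(this, EventArgs.Empty);
    }
}
```

Because GetDataAsync returns a Task instead of blocking, the caller can keep that Task, call CTS.Cancel() (or CTS.CancelAfter with a timeout) while the read is in progress, and await the Task afterwards.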