Memory issue in a TPL Dataflow implementation of IO read/write operations

Asked: 2015-12-23 02:28:08

Tags: c# multithreading file-io tpl-dataflow dataflow

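I have tried to implement read and write operations using file IO, and I wrapped those operations in a TransformBlock so that they are thread safe without using a locking mechanism.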

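The problem is that when I try to write 5 files in parallel, I get an out-of-memory exception, and this implementation also blocks the UI thread. The implementation is part of a Windows Phone project. Please point out what is wrong with it.

File IO operations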

TransformBlock

MainPage.xaml.cs usage

public static readonly IsolatedStorageFile _isolatedStore = IsolatedStorageFile.GetUserStoreForApplication();
public static readonly FileIO _file = new FileIO();
public static readonly ConcurrentExclusiveSchedulerPair taskSchedulerPair = new ConcurrentExclusiveSchedulerPair();
public static readonly ExecutionDataflowBlockOptions exclusiveExecutionDataFlow 
    = new ExecutionDataflowBlockOptions
{
    TaskScheduler = taskSchedulerPair.ExclusiveScheduler,
    BoundedCapacity = 1
};

public static readonly ExecutionDataflowBlockOptions concurrentExecutionDataFlow 
    = new ExecutionDataflowBlockOptions
{
    TaskScheduler = taskSchedulerPair.ConcurrentScheduler,
    BoundedCapacity = 1
};

public static async Task<T> LoadAsync<T>(string fileName)
{
    T result = default(T);

    var transBlock = new TransformBlock<string, T>
       (async fName =>
       {
           return await LoadData<T>(fName);
       }, concurrentExecutionDataFlow);

    transBlock.Post(fileName);

    result = await transBlock.ReceiveAsync();

    return result;
}

public static async Task SaveAsync<T>(T obj, string fileName)
{
    var transBlock = new TransformBlock<Tuple<T, string>, Task>
       (async tupleData =>
       {
          await SaveData(tupleData.Item1, tupleData.Item2);
       }, exclusiveExecutionDataFlow);

    transBlock.Post(new Tuple<T, string>(obj, fileName));

    await transBlock.ReceiveAsync();
}

Saving and loading data asynchronously

private static string data = "vjdsskjfhkjsdhvnvndjfhjvkhdfjkgd";
private static string fileName = string.Empty;
private List<string> DataLstSample = new List<string>();
private ObservableCollection<string> TestResults = new ObservableCollection<string>();
private static string data1 = "hjhkjhkhkjhjkhkhkjhkjhkhjkhjkh";
List<Task> allTsk = new List<Task>();
private Random rand = new Random();
private string  fileNameRand
{
    get
    {
        return rand.Next(100).ToString();
    }
}

public MainPage()
{
    InitializeComponent();

    for (int i = 0; i < 5; i ++)
    {
        DataLstSample.Add((i % 2) == 0 ? data : data1);
    }

}

private void Button_Click(object sender, RoutedEventArgs e)
{
    AppIsolatedStore_TestInMultiThread_LstResultShouldBeEqual();
}

public async void AppIsolatedStore_TestInMultiThread_LstResultShouldBeEqual()
{
    TstRst.Text = "InProgress..";
    allTsk.Clear();

    foreach(var data in DataLstSample)
    {
        var fName = fileNameRand;

        var t = Task.Run(async () =>
        {
            await AppIsolatedStore.SaveAsync<string>(data, fName);
        });

        TestResults.Add(string.Format("Writing file name: {0}, data: {1}", fName, data));
        allTsk.Add(t);
    }

    await Task.WhenAll(allTsk);

    TstRst.Text = "Completed..";
}

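Edit

One reason for the out-of-memory exception is that the data string I was using is too large. The string is available at this link: http://1drv.ms/1QWSAsc

The second problem remains, though: even when I use small data, the UI thread is still blocked. Is this code executing any task on the UI thread?

2 Answers: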

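Answer 0: (score: 1)

No. The ConcurrentExclusiveSchedulerPair you use runs its tasks on the default thread pool, and you start your tasks with the Run method, so the problem is not there. However, the code you posted has two main problems: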

var transBlock = new TransformBlock<string, T>
   (async fName =>
   {
       // process file here
   }, concurrentExecutionDataFlow);

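You really should not create transBlock on every call. The main idea of TPL Dataflow is that you create the blocks once and then keep using them. So you should refactor your application to reduce the number of blocks you instantiate; otherwise TPL Dataflow is not the right tool for this case.

The other problem in your code is that you are explicitly blocking the thread!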
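As a rough illustration of "create the block once and reuse it", the sketch below keeps a single static ActionBlock for all writes and lets each caller await its own request through a TaskCompletionSource. The names SharedSaveBlock, SaveRequest, Store and SaveDataAsync are hypothetical, and the in-memory dictionary only stands in for the real isolated-storage write from the question; treat it as one possible shape under those assumptions, not as the asker's actual implementation.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class SharedSaveBlock
{
    // Hypothetical in-memory stand-in for the isolated-storage file system.
    public static readonly ConcurrentDictionary<string, string> Store =
        new ConcurrentDictionary<string, string>();

    // Hypothetical request type: the payload plus a completion source
    // that lets each caller await its own write.
    private sealed class SaveRequest
    {
        public string FileName;
        public string Data;
        public readonly TaskCompletionSource<bool> Completion =
            new TaskCompletionSource<bool>();
    }

    // Created once for the whole application. An ActionBlock processes one
    // message at a time by default (MaxDegreeOfParallelism = 1), so writes
    // are serialized without an explicit lock.
    private static readonly ActionBlock<SaveRequest> SaveBlock =
        new ActionBlock<SaveRequest>(async request =>
        {
            try
            {
                await SaveDataAsync(request.Data, request.FileName);
                request.Completion.TrySetResult(true);
            }
            catch (Exception ex)
            {
                // Report the failure to the caller instead of faulting the block.
                request.Completion.TrySetException(ex);
            }
        });

    // Placeholder for the real file write (assumption: SaveData in the question).
    private static async Task SaveDataAsync(string data, string fileName)
    {
        await Task.Yield();
        Store[fileName] = data;
    }

    public static async Task SaveAsync(string data, string fileName)
    {
        var request = new SaveRequest { FileName = fileName, Data = data };
        await SaveBlock.SendAsync(request);   // queue the request on the shared block
        await request.Completion.Task;        // wait until this request was processed
    }
}
```

Callers would then use `await SharedSaveBlock.SaveAsync(data, fName);` in place of building a new TransformBlock per call.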

// Right here
await Task.WhenAll(allTsk);
TstRst.Text = "Completed..";

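Awaiting the task inside an async void method that is called from a synchronous event handler blocks the thread, because by default it captures the synchronization context. First, async void should be avoided. Second, if you go async, you should be async all the way, so the event handler should be async as well. Third, you can update your UI from a continuation on your task, or use the current synchronization context.

So your code should look something like this: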

// store the sync context in a field of your page (capture it on the UI thread)
SynchronizationContext syncContext = SynchronizationContext.Current;

// avoid the async void :)
public async Task AppIsolatedStore_TestInMultiThread_LstResultShouldBeEqual()

// make event handler async - this is the only exception for the async void use rule from above
private async void Button_Click(object sender, RoutedEventArgs e)

// asynchronously wait for the result without capturing the context
await Task.WhenAll(allTsk).ContinueWith(
  t => {
    // you can move out this logic to main method
    // Post takes the callback and a state object; no state is needed here
    syncContext.Post(new SendOrPostCallback(o =>
        {
            TstRst.Text = "Completed..";
        }), null);
  }
);

Answer 1: (score: 0)

Have you tried using the BoundedCapacity parameter on ExecutionDataflowBlockOptions? Introduction to TPL mentions block capacity:

  

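[...] bounding is useful in a dataflow network to avoid unbounded memory growth. This can be very important for reliability reasons if there is a possibility that producers could end up generating data much faster than the consumers could process it...

I suggest trying this option, which limits how many items can be queued for processing, and seeing whether it helps with your memory problem.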
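To make the effect of BoundedCapacity concrete, here is a small sketch with a deliberately slow consumer. The class name BoundedCapacityDemo and the 200 ms delay are illustrative assumptions. With a capacity of 1, Post is expected to be declined while the block is still busy with an earlier item, whereas SendAsync waits asynchronously until there is room.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class BoundedCapacityDemo
{
    public static async Task<Tuple<bool, bool, bool>> RunAsync()
    {
        // A slow consumer that holds at most one item at a time.
        var slowConsumer = new ActionBlock<int>(
            async item => await Task.Delay(200),
            new ExecutionDataflowBlockOptions { BoundedCapacity = 1 });

        // Post never waits: it returns false if the block declines the item.
        bool firstPost = slowConsumer.Post(1);   // block is empty, so this is accepted
        bool secondPost = slowConsumer.Post(2);  // block is still busy with item 1

        // SendAsync completes once the block has room and accepts the item.
        bool sent = await slowConsumer.SendAsync(3);

        slowConsumer.Complete();
        await slowConsumer.Completion;

        return Tuple.Create(firstPost, secondPost, sent);
    }
}
```

Because Post can decline items on a bounded block, producers that must not lose data would normally check its return value or use SendAsync.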