TPL Dataflow involving recursion never completes

Time: 2016-05-05 15:52:53

Tags: c# async-await task-parallel-library tpl-dataflow

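In a test WPF project I am trying to use TPL Dataflow to enumerate all subdirectories of a given parent directory and build a list of files with a particular file extension, e.g. ".xlsx". I use two blocks: the first is dirToFilesBlock and the last is fileActionBlock.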

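To create the recursive effect of walking all subdirectories, the first block has a link back to itself, with a link predicate that tests whether the output item is a directory. This is an approach I found in a book on asynchronous programming. The second link goes to fileActionBlock, which adds the file to a list, with a link predicate that tests whether the file has the right extension.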

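The problem is that after I kick things off with btnStart_Click, it never completes. That is, execution never gets below the await in the event handler to show the "Completed" message. I understand that I probably need to call dirToFilesBlock.Complete(), but I don't know where in the code it should go or under what conditions. I can't call it right after the initial Post, because that would stop the link back from feeding subdirectories into the block. I have tried using the InputCount and OutputCount properties but didn't get very far. If possible, I would like to keep the structure of the dataflow as it is, because it means I can also update the UI with each new directory that comes through the link back and give the user some feedback on progress.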

I am very new to TPL Dataflow and would appreciate any help.

Here is the code from the code-behind file:

public partial class MainWindow : Window
{
    TransformManyBlock<string, string> dirToFilesBlock;
    ActionBlock<string> fileActionBlock;
    ObservableCollection<string> files;
    CancellationTokenSource cts;
    CancellationToken ct;
    public MainWindow()
    {
        InitializeComponent();

        files = new ObservableCollection<string>();

        lst.DataContext = files;

        cts = new CancellationTokenSource();
        ct = cts.Token;
    }

    private Task Start(string path)
    {
        var uiScheduler = TaskScheduler.FromCurrentSynchronizationContext();

        dirToFilesBlock = new TransformManyBlock<string, string>((Func<string, IEnumerable<string>>)(GetFileSystemItems), new ExecutionDataflowBlockOptions() { CancellationToken = ct });
        fileActionBlock = new ActionBlock<string>((Action<string>)ProcessFile, new ExecutionDataflowBlockOptions() {CancellationToken = ct, TaskScheduler = uiScheduler});

        // Order of LinkTo's important here!
        dirToFilesBlock.LinkTo(dirToFilesBlock, new DataflowLinkOptions() { PropagateCompletion = true }, IsDirectory);
        dirToFilesBlock.LinkTo(fileActionBlock, new DataflowLinkOptions() { PropagateCompletion = true }, IsRequiredDocType);

        // Kick off the recursion.
        dirToFilesBlock.Post(path);

        return Task.WhenAll(dirToFilesBlock.Completion, fileActionBlock.Completion);
    }

    private bool IsDirectory(string path)
    {

        return Directory.Exists(path);
    }


    private bool IsRequiredDocType(string fileName)
    {
        return System.IO.Path.GetExtension(fileName) == ".xlsx";
    }

    private IEnumerable<string> GetFilesInDirectory(string path)
    {
        // Check for cancellation with each new dir.
        ct.ThrowIfCancellationRequested();

        // Check in case of Dir access problems
        try
        {
            return Directory.EnumerateFileSystemEntries(path);
        }
        catch (Exception)
        {
            return Enumerable.Empty<string>();
        }
    }

    private IEnumerable<string> GetFileSystemItems(string dir)
    {
        return GetFilesInDirectory(dir);
    }

    private void ProcessFile(string fileName)
    {
        ct.ThrowIfCancellationRequested();

       files.Add(fileName);
    }

    private async void btnStart_Click(object sender, RoutedEventArgs e)
    {
        try
        {
            await Start(@"C:\");
            // Never gets here!!!
            MessageBox.Show("Completed");

        }
        catch (OperationCanceledException)
        {
            MessageBox.Show("Cancelled");

        }
        catch (Exception)
        {
            MessageBox.Show("Unknown err");
        }
        finally
        {
        }
    }

    private void btnCancel_Click(object sender, RoutedEventArgs e)
    {
        cts.Cancel();
    }
}


1 Answer:

Answer 0 (score: 2)

Although this is an old question, handling completion in a dataflow loop is still a problem.

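In your case, you can have the TransformManyBlock keep a count of the items that are still in flight. A non-zero count indicates that the block is busy processing some number of items. You then call Complete() only once the block is not busy and both of its buffers are empty. You can find more information on handling completion in a post I wrote, Finding Completion in a Complex Flow: Feedback Loops.
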
public partial class MainWindow : Window {

        TransformManyBlock<string, string> dirToFilesBlock;
        ActionBlock<string> fileActionBlock;
        ObservableCollection<string> files;
        CancellationTokenSource cts;
        CancellationToken ct;
        public MainWindow() {
            InitializeComponent();

            files = new ObservableCollection<string>();

            lst.DataContext = files;

            cts = new CancellationTokenSource();
            ct = cts.Token;
        }

        private async Task Start(string path) {
            var uiScheduler = TaskScheduler.FromCurrentSynchronizationContext();

            dirToFilesBlock = new TransformManyBlock<string, string>((Func<string, IEnumerable<string>>)(GetFileSystemItems), new ExecutionDataflowBlockOptions() { CancellationToken = ct });
            fileActionBlock = new ActionBlock<string>((Action<string>)ProcessFile, new ExecutionDataflowBlockOptions() { CancellationToken = ct, TaskScheduler = uiScheduler });

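            // Order of LinkTo's important here!
            dirToFilesBlock.LinkTo(dirToFilesBlock, new DataflowLinkOptions() { PropagateCompletion = true }, IsDirectory);
            dirToFilesBlock.LinkTo(fileActionBlock, new DataflowLinkOptions() { PropagateCompletion = true }, IsRequiredDocType);
            // Discard entries that match neither predicate; otherwise they would sit in the
            // output buffer, block the items behind them, and keep OutputCount above zero.
            dirToFilesBlock.LinkTo(DataflowBlock.NullTarget<string>());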

            // Kick off the recursion.
            dirToFilesBlock.Post(path);

            await ProcessingIsComplete();
            dirToFilesBlock.Complete();
            await Task.WhenAll(dirToFilesBlock.Completion, fileActionBlock.Completion);
        }

        private async Task ProcessingIsComplete() {
            // Keep polling while the block is still busy (i.e. NOT idle).
            while (!ct.IsCancellationRequested && !DirectoryToFilesBlockIsIdle()) {
                await Task.Delay(500);
            }
        }

        private bool DirectoryToFilesBlockIsIdle() {
            return dirToFilesBlock.InputCount == 0 &&
                dirToFilesBlock.OutputCount == 0 &&
                directoriesBeingProcessed <= 0;
        }

        private bool IsDirectory(string path) {
            return Directory.Exists(path);
        }


        private bool IsRequiredDocType(string fileName) {
            return System.IO.Path.GetExtension(fileName) == ".xlsx";
        }

        private int directoriesBeingProcessed = 0;

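        private IEnumerable<string> GetFilesInDirectory(string path) {
            // Check for cancellation with each new dir, before touching the counter,
            // so a cancelled call never leaves the counter incremented.
            ct.ThrowIfCancellationRequested();

            Interlocked.Increment(ref directoriesBeingProcessed);

            // Check in case of Dir access problems
            try {
                // ToList() forces the enumeration to happen here, inside the try block
                // and while the directory is still counted as being processed.
                return Directory.EnumerateFileSystemEntries(path).ToList();
            } catch (Exception) {
                return Enumerable.Empty<string>();
            } finally {
                Interlocked.Decrement(ref directoriesBeingProcessed);
            }
        }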

        private IEnumerable<string> GetFileSystemItems(string dir) {
            return GetFilesInDirectory(dir);
        }

        private void ProcessFile(string fileName) {
            ct.ThrowIfCancellationRequested();

            files.Add(fileName);
        }

        private async void btnStart_Click(object sender, RoutedEventArgs e) {
            try {
                await Start(@"C:\");
                // Reached once both blocks have completed.
                MessageBox.Show("Completed");

            } catch (OperationCanceledException) {
                MessageBox.Show("Cancelled");

            } catch (Exception) {
                MessageBox.Show("Unknown err");
            } finally {
            }
        }

        private void btnCancel_Click(object sender, RoutedEventArgs e) {
            cts.Cancel();
        }
    }
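
The polling check above is timing-based: there may be a brief window in which a directory has left the delegate but has not yet shown up in either buffer. As a hedged alternative (not the approach from the answer), the sketch below counts outstanding directories and calls Complete() from inside the block's delegate when the count reaches zero. Everything in it is hypothetical illustration: the FeedbackLoopSketch class, the FindAsync method, and the in-memory Tree dictionary that stands in for the file system so the example runs without touching a disk. It assumes the System.Threading.Tasks.Dataflow library is available (on .NET Framework this is a NuGet package).

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class FeedbackLoopSketch
{
    // Hypothetical in-memory "file system": each key is a directory, each value its entries.
    static readonly Dictionary<string, string[]> Tree = new Dictionary<string, string[]>
    {
        ["root"]     = new[] { "root/a", "root/b", "root/1.xlsx", "root/notes.txt" },
        ["root/a"]   = new[] { "root/a/2.xlsx", "root/a/c" },
        ["root/a/c"] = new[] { "root/a/c/3.xlsx" },
        ["root/b"]   = new string[0],
    };

    static bool IsDirectory(string path) => Tree.ContainsKey(path);

    static bool IsXlsx(string path) =>
        path.EndsWith(".xlsx", StringComparison.OrdinalIgnoreCase);

    public static async Task<List<string>> FindAsync(string root)
    {
        var found = new ConcurrentQueue<string>();

        // Directories that have been posted or emitted but not yet expanded.
        // The root counts as the first one.
        int pending = 1;

        TransformManyBlock<string, string> dirBlock = null;
        Func<string, IEnumerable<string>> expand = dir =>
        {
            string[] entries = Tree[dir];

            // Add the children before removing the parent, so the counter
            // cannot reach zero while a subdirectory is still queued.
            Interlocked.Add(ref pending, entries.Count(IsDirectory));
            if (Interlocked.Decrement(ref pending) == 0)
            {
                // No directory is left anywhere in the loop. The entries returned
                // below are still emitted; only new input is declined.
                dirBlock.Complete();
            }
            return entries;
        };
        dirBlock = new TransformManyBlock<string, string>(expand);
        var fileBlock = new ActionBlock<string>(f => found.Enqueue(f));

        // Order matters: directories loop back, .xlsx files go on, the rest is discarded.
        dirBlock.LinkTo(dirBlock, IsDirectory);
        dirBlock.LinkTo(fileBlock, new DataflowLinkOptions { PropagateCompletion = true }, IsXlsx);
        dirBlock.LinkTo(DataflowBlock.NullTarget<string>());

        dirBlock.Post(root);
        await fileBlock.Completion;
        return found.OrderBy(f => f, StringComparer.Ordinal).ToList();
    }
}
```

Calling `await FeedbackLoopSketch.FindAsync("root")` on the sample tree should return the three .xlsx paths and then finish without any polling delay. The trade-off is that the delegate has to materialize each directory listing in order to count its subdirectories before returning, and the counting predicate must agree exactly with the predicate on the loop-back link.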