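I am working on an application that processes a large amount of text data and gathers statistics on word occurrences (see: Source Code Word Cloud).
Here is the simplified core of my code.
Everything works well with LINQ. Switching to PLINQ gave me a significant performance boost. But... cancelability during long-running queries is lost.
It seems that the OrderBy query synchronizes data back onto the main thread, and Windows messages are not processed in the meantime.
In the example below I demonstrate my implementation of cancellation according to the MSDN article How to: Cancel a PLINQ Query, which does not work :(
Any other ideas?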
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading;
using System.Windows.Forms;

namespace PlinqCancelability
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
            m_CancellationTokenSource = new CancellationTokenSource();
        }

        private readonly CancellationTokenSource m_CancellationTokenSource;

        private void buttonStart_Click(object sender, EventArgs e)
        {
            var result = Directory
                .EnumerateFiles(@"c:\temp", "*.txt", SearchOption.AllDirectories)
                .AsParallel()
                .WithCancellation(m_CancellationTokenSource.Token)
                .SelectMany(File.ReadLines)
                .SelectMany(ReadWords)
                .GroupBy(word => word, (word, words) => new Tuple<int, string>(words.Count(), word))
                .OrderByDescending(occurrencesWordPair => occurrencesWordPair.Item1)
                .Take(20);

            try
            {
                // The query is evaluated here, on the UI thread.
                foreach (Tuple<int, string> tuple in result)
                {
                    Console.WriteLine(tuple);
                }
            }
            catch (OperationCanceledException ex)
            {
                Console.WriteLine(ex.Message);
            }
        }

        private void buttonCancel_Click(object sender, EventArgs e)
        {
            m_CancellationTokenSource.Cancel();
        }

        // Splits a line into words, treating every non-letter as a separator.
        private static IEnumerable<string> ReadWords(string line)
        {
            StringBuilder word = new StringBuilder();
            foreach (char ch in line)
            {
                if (char.IsLetter(ch))
                {
                    word.Append(ch);
                }
                else
                {
                    // Skip consecutive separators; emit the word otherwise.
                    if (word.Length == 0) continue;
                    yield return word.ToString();
                    word.Clear();
                }
            }

            // Emit the last word when the line ends with a letter.
            if (word.Length != 0)
            {
                yield return word.ToString();
            }
        }
    }
}
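Answer 0 (score: 3)
As Jon said, you need to start the PLINQ operation on a background thread. That way, the user interface does not hang while waiting for the operation to complete, so the Cancel button's event handler can run and call the Cancel method of the cancellation token source. The PLINQ query is canceled automatically when the token is canceled, so you do not need to worry about that.
Here is one way to do this: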
// Requires: using System.Threading.Tasks;
private void buttonStart_Click(object sender, EventArgs e)
{
    // Starts a task that runs the operation (on a background thread).
    // Note: I added 'ToList' so that the result is actually evaluated
    // and all results are stored in an in-memory data structure.
    var task = Task.Factory.StartNew(() =>
        Directory
            .EnumerateFiles(@"c:\temp", "*.txt", SearchOption.AllDirectories)
            .AsParallel()
            .WithCancellation(m_CancellationTokenSource.Token)
            .SelectMany(File.ReadLines)
            .SelectMany(ReadWords)
            .GroupBy(word => word, (word, words) =>
                new Tuple<int, string>(words.Count(), word))
            .OrderByDescending(occurrencesWordPair => occurrencesWordPair.Item1)
            .Take(20)
            .ToList(),
        m_CancellationTokenSource.Token);

    // Specify what happens when the task completes.
    // Use 'this.Invoke' so that the operation happens on the GUI thread
    // (where you can safely access GUI elements of your WinForms app).
    task.ContinueWith(res =>
    {
        this.Invoke(new Action(() =>
        {
            try
            {
                foreach (Tuple<int, string> tuple in res.Result)
                {
                    Console.WriteLine(tuple);
                }
            }
            catch (AggregateException ex)
            {
                // A canceled or faulted task reports its failure through
                // 'Result' as an AggregateException; for a canceled task
                // the inner exception is a TaskCanceledException.
                Console.WriteLine(ex.InnerException.Message);
            }
        }));
    });
}
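A detail of this pattern that is easy to miss: once the task has been canceled, reading `Result` inside the continuation throws an `AggregateException` (wrapping a `TaskCanceledException`) rather than a bare `OperationCanceledException`. The console sketch below is not part of the original answer. It cancels the token up front, an assumption made only to keep the outcome deterministic, and checks `IsCanceled` before touching `Result`. The class name `CanceledTaskSketch` and the doubled-number query are hypothetical stand-ins for the word-count query.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class CanceledTaskSketch
{
    public static void Main()
    {
        var cts = new CancellationTokenSource();
        cts.Cancel(); // cancel before starting so the task ends up canceled

        // Stand-in for the word-count query: any PLINQ query observing the token.
        Task<int[]> task = Task.Factory.StartNew(
            () => Enumerable.Range(0, 100)
                .AsParallel()
                .WithCancellation(cts.Token)
                .Select(n => n * 2)
                .ToArray(),
            cts.Token);

        Task continuation = task.ContinueWith(res =>
        {
            if (res.IsCanceled)
            {
                // Do not read res.Result here; it would throw.
                Console.WriteLine("canceled");
                return;
            }

            try
            {
                Console.WriteLine(res.Result.Length);
            }
            catch (AggregateException ex)
            {
                // Reached when the query itself failed.
                Console.WriteLine(ex.InnerException.Message);
            }
        });

        continuation.Wait();
    }
}
```

In the WinForms handler the same check would sit inside the `this.Invoke` callback, before the `foreach` over `res.Result`.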
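Answer 1 (score: 1)
You are currently iterating over the query results on the UI thread. Even though the query executes in parallel, you still iterate over the results on the UI thread. That means the UI thread is too busy performing computations (or waiting for the query to get results from its other threads) to respond to a click on the "Cancel" button.
You need to move the work of iterating over the query results onto a background thread.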
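As a rough illustration of that advice, here is a minimal console sketch, assuming .NET 4.5 or later (for `Task.Run` and `await`) and C# 7.1 or later (for `async Main`). `BackgroundQuerySketch` and `CountWordsAsync` are hypothetical names, and the in-memory word array stands in for the file-reading part of the query. The point is that `ToList` forces the enumeration to happen inside the background task, so the caller only awaits the finished list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class BackgroundQuerySketch
{
    // Hypothetical helper: runs the PLINQ query, including its
    // enumeration (ToList), on a thread-pool thread.
    public static Task<List<Tuple<int, string>>> CountWordsAsync(
        IEnumerable<string> words, CancellationToken token)
    {
        return Task.Run(() =>
            words
                .AsParallel()
                .WithCancellation(token)
                .GroupBy(word => word, (word, group) => Tuple.Create(group.Count(), word))
                .OrderByDescending(pair => pair.Item1)
                .Take(20)
                .ToList(),
            token);
    }

    public static async Task Main()
    {
        using (var cts = new CancellationTokenSource())
        {
            var words = new[] { "a", "b", "a", "c", "a", "b" };
            try
            {
                // The calling thread is not blocked while the query runs.
                List<Tuple<int, string>> top = await CountWordsAsync(words, cts.Token);
                foreach (Tuple<int, string> pair in top)
                {
                    Console.WriteLine(pair);
                }
            }
            catch (OperationCanceledException ex)
            {
                // Awaiting a canceled task rethrows an OperationCanceledException.
                Console.WriteLine(ex.Message);
            }
        }
    }
}
```

In a WinForms handler the same `await` could sit inside an `async void buttonStart_Click`, which leaves the UI thread free to process the Cancel click while the query runs.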
Answer 2 (score: -1)
I think I found a fairly elegant solution that fits better with the LINQ / PLINQ concept.
I declare an extension method.
public static class ProcessWindowsMessagesExtension
{
    public static ParallelQuery<TSource> DoEvents<TSource>(this ParallelQuery<TSource> source)
    {
        return source.Select(
            item =>
            {
                Application.DoEvents();
                Thread.Yield();
                return item;
            });
    }
}
Then I add it to the query wherever I want it to stay responsive.
var result = Directory
    .EnumerateFiles(@"c:\temp", "*.txt", SearchOption.AllDirectories)
    .AsParallel()
    .WithCancellation(m_CancellationTokenSource.Token)
    .SelectMany(File.ReadLines)
    .DoEvents()
    .SelectMany(ReadWords)
    .GroupBy(word => word, (word, words) => new Tuple<int, string>(words.Count(), word))
    .OrderByDescending(occurrencesWordPair => occurrencesWordPair.Item1)
    .Take(20);
It works fine!
See my post for more information and source code: "Cancel me if you can" or PLINQ cancelability & responsiveness in WinForms