Aggregating custom data from a ParallelQuery

Time: 2016-01-22 18:46:58

Tags: c# aggregate plinq

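I parse a large number of words (millions) from some files and count them by language. I use PLINQ for performance, but judging from Task Manager, the whole process appears to run sequentially. Perhaps my aggregate function is blocking the parallelism.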

Is that possible?

Here is the offending PLINQ query:

ParallelQuery<string> query = Directory.EnumerateFiles(test, "*.d", SearchOption.AllDirectories).AsParallel();
query = query.SelectMany(parseStrings).Where(isValidPhrase);
query = query.SelectMany(s => Regex.Matches(s, @"\w+").Cast<Match>().Select(match => match.Value));

Result output = query.Aggregate(new Result(), (result, word) =>
{
    if (word.All(russianAlfabet.Contains))
        result.Ru++;
    else if (czechWords.Contains(word))
        result.Cs++;
    else
        result.Other++;

    return result;
});

...and this is the class that holds the aggregated result:

class Result {
    public int Ru { get; set; }
    public int Cs { get; set; }
    public int Other { get; set; }
}

1 answer:

Answer 0 (score: 1)

ParallelEnumerable.Aggregate

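Try this overload, which also takes a function for combining the accumulators produced by different partitions: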
public static TResult Aggregate<TSource, TAccumulate, TResult>(
    this ParallelQuery<TSource> source,
    TAccumulate seed,
    Func<TAccumulate, TSource, TAccumulate> updateAccumulatorFunc,
    Func<TAccumulate, TAccumulate, TAccumulate> combineAccumulatorsFunc,
    Func<TAccumulate, TResult> resultSelector
)
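
As a rough sketch of how the question's query might use this kind of overload: the version below uses the closely related ParallelEnumerable.Aggregate overload that takes a seedFactory instead of a seed. That choice is an assumption on my part, made because Result is a mutable class and a single seed instance shared by several partitions would not be safe to increment concurrently. The contents of russianAlfabet and czechWords, and the names Program and Count, are hypothetical stand-ins, since the question does not show them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Result
{
    public int Ru { get; set; }
    public int Cs { get; set; }
    public int Other { get; set; }
}

static class Program
{
    // Hypothetical stand-ins for the question's russianAlfabet and czechWords.
    // They are only read during the query, never modified.
    static readonly HashSet<char> russianAlfabet =
        new HashSet<char>("абвгдеёжзийклмнопрстуфхцчшщъыьэюя");
    static readonly HashSet<string> czechWords =
        new HashSet<string> { "ahoj", "pivo" };

    public static Result Count(ParallelQuery<string> words)
    {
        return words.Aggregate<string, Result, Result>(
            // seedFactory: every partition starts from its own Result instance.
            () => new Result(),
            // updateAccumulatorFunc: folds one word into a partition-local Result.
            (result, word) =>
            {
                if (word.All(russianAlfabet.Contains))
                    result.Ru++;
                else if (czechWords.Contains(word))
                    result.Cs++;
                else
                    result.Other++;
                return result;
            },
            // combineAccumulatorsFunc: merges two partial results into a new one.
            (a, b) => new Result
            {
                Ru = a.Ru + b.Ru,
                Cs = a.Cs + b.Cs,
                Other = a.Other + b.Other
            },
            // resultSelector: the merged Result is already the final answer.
            total => total);
    }

    static void Main()
    {
        var words = new[] { "привет", "ahoj", "hello", "мир", "pivo" };
        Result r = Count(words.AsParallel());
        Console.WriteLine($"Ru={r.Ru} Cs={r.Cs} Other={r.Other}");
    }
}
```

With the question's pipeline, the final `query` would be passed in place of `words.AsParallel()`. Whether this actually improves CPU utilisation depends on where the time is spent (file I/O, regex matching, or counting), so it is worth measuring rather than assuming.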