Parallel.ForEach: return results in input order, not execution order, into a list/dictionary

Time: 2017-02-08 12:08:23

Tags: c# multithreading plinq concurrentdictionary

If you have this:

var resultlist = new List<Dictionary<DateTime, double>>();
Parallel.ForEach(input, item =>
{
    resultlist.Add(SomeDataDictionary(item));
});

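The results end up in the order in which the SomeDataDictionary calls happen to finish, not in input order.

Is there a way to preserve the input order?

Or is the only option to change the data type and use a Parallel.For loop, passing the index along so that each result is written into some kind of array?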
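The index-based approach described in the question could be sketched roughly as follows. This is only an illustration: it assumes the input is an int array, and the SomeDataDictionary body here is a hypothetical stand-in for the real method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

static class IndexedParallelSketch
{
    // Hypothetical stand-in for the question's SomeDataDictionary method.
    static Dictionary<DateTime, double> SomeDataDictionary(int item)
    {
        return new Dictionary<DateTime, double>
        {
            { new DateTime(2017, 1, 1).AddDays(item), item * 2.0 }
        };
    }

    public static Dictionary<DateTime, double>[] Run(int[] input)
    {
        // One slot per input element. Each iteration writes only its own slot,
        // so no locking is needed and the slot position preserves input order.
        var results = new Dictionary<DateTime, double>[input.Length];

        Parallel.For(0, input.Length, i =>
        {
            results[i] = SomeDataDictionary(input[i]);
        });

        return results;
    }

    static void Main()
    {
        var input = Enumerable.Range(1, 100).ToArray();
        var results = Run(input);

        // results[i] always corresponds to input[i], whatever the execution order was.
        Console.WriteLine(results[41].Values.Single());
    }
}
```

Because the array is sized up front and no two iterations share a slot, this avoids the thread-safety problem of calling List<T>.Add from several threads.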

2 Answers:

Answer 0: (score: 7)

List<T> is not thread-safe, which is why calling resultlist.Add from inside the parallel loop is incorrect. I suggest using PLinq instead:

var resultsList = input
    .AsParallel()
    // .AsOrdered() // uncomment this if you want to preserve input order
    .Select(item => SomeDataDictionary(item))
    .ToList();
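To see the effect of AsOrdered in isolation, here is a small self-contained sketch. The Transform method is a hypothetical stand-in for SomeDataDictionary; the program compares the parallel output with the sequential one to check that the order matches.

```csharp
using System;
using System.Linq;

static class AsOrderedSketch
{
    // Hypothetical stand-in for SomeDataDictionary: any per-item transformation.
    static int Transform(int item)
    {
        return item * item;
    }

    public static int[] Run(int[] input)
    {
        return input
            .AsParallel()
            .AsOrdered()                    // ask PLinq to yield results in source order
            .Select(x => Transform(x))
            .ToArray();
    }

    static void Main()
    {
        var input = Enumerable.Range(1, 1000).ToArray();
        var output = Run(input);

        // Compare against the sequential result to confirm the order is the same.
        Console.WriteLine(output.SequenceEqual(input.Select(x => Transform(x))));
    }
}
```

Without the AsOrdered call the same elements are produced, but their order in the output is not guaranteed.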

Answer 1: (score: 3)

Solution using ConcurrentDictionary

You can use a ConcurrentDictionary because it is thread-safe, and you can use the key to store the order.

var resultDictionary = new ConcurrentDictionary<long, Dictionary<DateTime, double>>();

// Use the loop index (supplied by Parallel.ForEach as a long) as the key
Parallel.ForEach(input, (item, state, index) => {
    resultDictionary.TryAdd(index, SomeDataDictionary(item));
});

// Convert the dictionary to a list in the required order
var resultList = resultDictionary.Keys.OrderBy(k => k).Select(k => resultDictionary[k]).ToList();

ConcurrentDictionary vs PLinq performance

Dmitry Bychenko provided a working PLinq solution in a separate answer.

Let's build a test harness to compare the two solutions:

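using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

class so42112722
{
    private readonly int[] input = Enumerable.Range(1, 5000).ToArray();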

    public void RunTest()
    {
        var t1 = timeAction(ParallelUsingLoopStateAndDictionary);
        var t2 = timeAction(ParallelUsingPLinq);

        var diff = (t1 - t2);
        var pct = diff / (t1 > t2 ? t2 : t1);

        Console.WriteLine("| {0:0,000.000} | {1:0,000.000} | {2} is {3:0.00%} faster!", t1, t2, (diff > 0 ? "PLinq" : "ConcurrentDictionary"), Math.Abs(pct));
    }


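    // Runs the action once and returns the elapsed wall-clock time in milliseconds.
    double timeAction(Action action)
    {
        // Stopwatch is intended for measuring intervals; DateTime.Now is not.
        var stopwatch = System.Diagnostics.Stopwatch.StartNew();

        action();

        stopwatch.Stop();
        return stopwatch.Elapsed.TotalMilliseconds;
    }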

    private void ParallelUsingLoopStateAndDictionary()
    {
        // The key is the loop index, which Parallel.ForEach supplies as a long.
        var resultDictionary = new ConcurrentDictionary<long, Dictionary<DateTime, double>>();

        Parallel.ForEach(input, (item, state, index) =>
        {
            resultDictionary.TryAdd(index, ExpensiveTransformation(item));
        });

        // Sort by index to restore input order.
        var resultList = resultDictionary.Keys.OrderBy(k => k).Select(k => resultDictionary[k]).ToList();
    }

    private void ParallelUsingPLinq()
    {
        var resultsList = input
            .AsParallel()
            .AsOrdered()
            .Select(item => ExpensiveTransformation(item))
            .ToList();
    }

    private Dictionary<DateTime, double> ExpensiveTransformation(double item)
    {
        Random rnd = new Random();
        int iterCount = 5000;

        var dict = new Dictionary<DateTime, double>();

        for (int i = 0; i < iterCount; i++)
        {
            DateTime dt = DateTime.Now.AddDays(-i * 3).AddMinutes(i).AddSeconds(item * rnd.Next(100, 1000)).AddMilliseconds(-i);

            var val = Math.Pow(item, rnd.Next(2, 5)) + rnd.Next(100, iterCount) / (i + 1);

            dict.Add(dt, val);
        }

        return dict;
    }

}

Now we can run the test from a simple console application:

static void Main(string[] args)
{

    so42112722 test = new so42112722();

    Console.WriteLine("Comparing ConcurrentDictionary to PLinq:");

    for (int i = 0; i < 10; i++)
    {
        test.RunTest();
    }

    Console.ReadLine();
}

The results are as follows:

Comparing ConcurrentDictionary to PLinq:
| 7,310.756 | 7,597.217 | ConcurrentDictionary is 3.92% faster!
| 7,883.528 | 7,978.108 | ConcurrentDictionary is 1.20% faster!
| 8,075.709 | 8,072.501 | PLinq is 0.04% faster!
| 8,206.721 | 8,193.054 | PLinq is 0.17% faster!
| 8,256.499 | 8,305.187 | ConcurrentDictionary is 0.59% faster!
| 8,424.029 | 8,286.195 | PLinq is 1.66% faster!
| 8,316.973 | 8,261.499 | PLinq is 0.67% faster!
| 8,312.165 | 8,254.285 | PLinq is 0.70% faster!
| 8,328.433 | 8,369.385 | ConcurrentDictionary is 0.49% faster!
| 8,472.054 | 8,344.197 | PLinq is 1.53% faster!

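(Numbers are in milliseconds. The first column is ConcurrentDictionary, the second is PLinq.)

This test was run on a quad-core Intel Core i5 CPU. Your mileage may vary.

PLinq was faster in 6 of the 10 runs, but only by a small margin. Summed over the 10 test iterations, the ConcurrentDictionary approach was 74.76 ms (0.092%) faster overall. It looks a lot like the United States presidential election, 2016, where you can get more votes and still lose :).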
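For anyone who wants to check the arithmetic behind that summary, the following sketch recomputes it from the timings in the results table. The class and method names are made up for illustration; the figures are copied from the table.

```csharp
using System;
using System.Linq;

static class TotalsSketch
{
    // Timings in milliseconds, copied from the results table.
    static readonly double[] DictionaryMs =
    {
        7310.756, 7883.528, 8075.709, 8206.721, 8256.499,
        8424.029, 8316.973, 8312.165, 8328.433, 8472.054
    };

    static readonly double[] PLinqMs =
    {
        7597.217, 7978.108, 8072.501, 8193.054, 8305.187,
        8286.195, 8261.499, 8254.285, 8369.385, 8344.197
    };

    // Number of runs in which PLinq took less time than ConcurrentDictionary.
    public static int PLinqWins()
    {
        return DictionaryMs.Zip(PLinqMs, (d, p) => p < d).Count(won => won);
    }

    // Positive when ConcurrentDictionary took less time in total.
    public static double TotalDifferenceMs()
    {
        return PLinqMs.Sum() - DictionaryMs.Sum();
    }

    static void Main()
    {
        Console.WriteLine("PLinq faster in {0} of {1} runs", PLinqWins(), DictionaryMs.Length);
        Console.WriteLine("ConcurrentDictionary total lead: {0:0.00} ms ({1:0.000%})",
            TotalDifferenceMs(), TotalDifferenceMs() / PLinqMs.Sum());
    }
}
```

With the table's figures this gives 6 wins for PLinq and a total lead of about 74.76 ms for ConcurrentDictionary, which is roughly 0.092% of the total run time.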

Verdict

Don't try to over-optimize your code. The .NET Framework is there to help you. If PLinq simplifies your code, use it; on the other hand, if you need more control, take it.