Improving the time complexity of a current Linq query

Time: 2016-08-24 07:56:06

Tags: c# algorithm linq data-structures

I have the following lists:

RakeSnapshots, ProductMovements

The goal is to process both and get the count of elements that match the following conditions:

  • Use RakeSnapshots
      • Consider StatusCode == "Dumping"
  • Use ProductMovements
      • Consider Status == "InProgress"
  • Get the count of all elements in both lists that meet the condition RakeSnapshots.RakeCode equals ProductMovements.ProductCode

These are my current options:

// Code 1:

 var resultCount =  ProductMovements.Where(x => RakeSnapshots
                                                .Where(r => r.StatusCode == "Dumping")
                                                .Any(y => y.RakeCode == x.ProductCode  && 
                                                          x.Status == "InProgress"))
                                                .Count();

// Code 2:

var productMovementsInprogress = ProductMovements.Where(x => x.Status == "InProgress");

var rakeSnapShotsDumping = RakeSnapshots.Where(r => r.StatusCode == "Dumping");

var resultCount = productMovementsInprogress.Zip(rakeSnapShotsDumping,(x,y) => (y.RakeCode == x.ProductCode) ?  true : false)
                                            .Where(x => x).Count();

The challenge is that the code has O(n^2) complexity, which will hurt if the data is very large. Is there a way to improve it?

2 answers:

Answer 0 (score: 3)

It sounds like Group Join (as well as Join) is the most efficient LINQ way of correlating two sets:

var resultCount = ProductMovements.Where(p => p.Status == "InProgress")
    .GroupJoin(RakeSnapshots.Where(r => r.StatusCode == "Dumping"), 
        p => p.ProductCode, r => r.RakeCode, (p, match) => match)
    .Count(match => match.Any());

The time complexity of the above is O(N + M).
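
The linear bound comes from hashing: in LINQ to Objects, GroupJoin and Join build a hash-based lookup over the inner sequence once and then probe it for each outer element. The same idea can be written by hand with a HashSet. This is a minimal sketch, assuming in-memory collections and string codes; the names HashCountSketch and CountMatches and the tuple-shaped inputs are made up for illustration and do not appear in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HashCountSketch
{
    // Counts in-progress product movements whose ProductCode matches the
    // RakeCode of at least one dumping rake snapshot.
    // (Hypothetical helper; tuples stand in for the question's classes.)
    public static int CountMatches(
        IEnumerable<(string StatusCode, string RakeCode)> rakeSnapshots,
        IEnumerable<(string Status, string ProductCode)> productMovements)
    {
        // One pass over the snapshots: O(M) expected time.
        var dumpingCodes = new HashSet<string>(
            rakeSnapshots.Where(r => r.StatusCode == "Dumping")
                         .Select(r => r.RakeCode));

        // One pass over the movements: O(N) expected time, O(1) per lookup.
        return productMovements.Count(
            p => p.Status == "InProgress" && dumpingCodes.Contains(p.ProductCode));
    }

    public static void Main()
    {
        var rakes = new[] { ("Dumping", "1"), ("Dumping", "1"), ("Idle", "3") };
        var moves = new[] { ("InProgress", "1"), ("InProgress", "3"), ("Done", "1") };

        Console.WriteLine(CountMatches(rakes, moves)); // prints 1
    }
}
```

Because the set stores each code once, duplicate RakeCode values cannot inflate the count, which matches the Any-based semantics of the question's first snippet.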

Answer 1 (score: 3)

You can do this with an inner join:

var dumpingRakeSnapshots       = rakeSnapshots.Where(r => r.StatusCode == "Dumping");
var inProgressProductMovements = productMovements.Where(p => p.Status == "InProgress");

var matches =
    from r in dumpingRakeSnapshots
    join p in inProgressProductMovements on r.RakeCode equals p.ProductCode
    select r;

int count = matches.Count(); // Here's the answer.

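Note that (as Ivan Stoev points out) this only works if RakeCode is a primary key of RakeSnapshots.

If it is not, you must use a grouped join.

Here is the Linq query syntax version you should use in that case. Note that it is exactly the same as Ivan's answer, only in Linq query form: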

var matches =
    from r in dumpingRakeSnapshots
    join p in inProgressProductMovements on r.RakeCode equals p.ProductCode into gj
    select gj;

For completeness, here is a compilable console app that demonstrates the different results you get if RakeCode and ProductCode are not primary keys:

using System;
using System.Collections.Generic;
using System.Linq;

namespace ConsoleApp1
{
    class RakeSnapshot
    {
        public string StatusCode;
        public string RakeCode;
    }

    class ProductMovement
    {
        public string Status;
        public string ProductCode;
    }

    sealed class Program
    {
        void run()
        {
            var rakeSnapshots = new List<RakeSnapshot>
            {
                new RakeSnapshot {StatusCode = "Dumping", RakeCode = "1"},
                new RakeSnapshot {StatusCode = "Dumping", RakeCode = "1"},
                new RakeSnapshot {StatusCode = "Dumping", RakeCode = "2"}
            };

            var productMovements = new List<ProductMovement>
            {
                new ProductMovement {Status = "InProgress", ProductCode = "1"},
                new ProductMovement {Status = "InProgress", ProductCode = "2"},
                new ProductMovement {Status = "InProgress", ProductCode = "2"}
            };

            var dumpingRakeSnapshots       = rakeSnapshots.Where(r => r.StatusCode == "Dumping");
            var inProgressProductMovements = productMovements.Where(p => p.Status == "InProgress");

            // Inner join.

            var matches1 =
                from r in dumpingRakeSnapshots
                join p in inProgressProductMovements on r.RakeCode equals p.ProductCode
                select r;

            Console.WriteLine(matches1.Count());

            // Grouped join.

            var matches2 =
                from r in dumpingRakeSnapshots
                join p in inProgressProductMovements on r.RakeCode equals p.ProductCode into gj
                select gj;

            Console.WriteLine(matches2.Count());

            // OP's code.

            var resultCount = 
                productMovements
                .Count(x => rakeSnapshots
                .Where(r => r.StatusCode == "Dumping")
                .Any(y => y.RakeCode == x.ProductCode && x.Status == "InProgress"));

            Console.WriteLine(resultCount);
        }

        static void Main(string[] args)
        {
            new Program().run();
        }
    }
}
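
With this sample data the program prints 4, 3 and 3. The inner join counts every matching pair, so duplicated codes inflate it, while the grouped join yields exactly one group per dumping snapshot. One thing worth checking: `select gj` also yields an empty group for a snapshot that has no match, so `Count()` on it equals the number of dumping snapshots, and the two 3s agree here only because every snapshot has a match. The sketch below filters the groups with `gj.Any()`. It uses made-up data and hypothetical names (GroupJoinSketch, WithEmptyGroups, NonEmptyOnly) and plain string codes in place of the classes above.

```csharp
using System;
using System.Linq;

public static class GroupJoinSketch
{
    // Grouped join as written above: one group per outer element, empty or not.
    public static int WithEmptyGroups(string[] rakeCodes, string[] productCodes) =>
        (from r in rakeCodes
         join p in productCodes on r equals p into gj
         select gj).Count();

    // Same join, keeping only groups that contain at least one match.
    public static int NonEmptyOnly(string[] rakeCodes, string[] productCodes) =>
        (from r in rakeCodes
         join p in productCodes on r equals p into gj
         where gj.Any()
         select gj).Count();

    public static void Main()
    {
        var rakeCodes = new[] { "1", "2", "9" };    // "9" has no matching product
        var productCodes = new[] { "1", "2", "2" };

        Console.WriteLine(WithEmptyGroups(rakeCodes, productCodes)); // prints 3
        Console.WriteLine(NonEmptyOnly(rakeCodes, productCodes));    // prints 2
    }
}
```

This counts rake snapshots that have at least one matching movement. To count product movements instead, as the question's code does, the movements would be the outer sequence, as in Ivan's GroupJoin answer.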