Why does a double GroupBy + ToList take so long?

Asked: 2017-04-23 09:03:06

Tags: c# list linq group-by

I have these queries:

    var Data = (from ftr in db.TB_FTR
                join mst in db.TB_MST on ftr.MST_ID equals mst.MST_ID
                join trf in db.TB_TRF on mst.TRF_ID equals trf.ID
                select new CityCountyType { City = ftr.CITY, County = ftr.COUNTY, Type = trf.TYPE }
               ).OrderBy(i => i.City).ThenBy(i => i.County);

    var Data2 =
        Data.GroupBy(i => new {i.City, i.County, i.Type})
            .Select(group => new {Name = group.Key, Count = group.Count()})
            .OrderBy(x => x.Name)
            .ThenByDescending(x => x.Count)
            .GroupBy(g => new {g.Name.City, g.Name.County})
            .Select(g => g.Select(g2 =>
                new {Name = new {g.Key.City, g.Key.County, g2.Name.Type}, g2.Count})).ToList();

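I am trying to get lists of objects that share the same county and city. But the second query takes too long to return a result: I waited about 30 minutes and got nothing, even though Data contains only about 5,000 records. How can I change these queries so that I get the list of lists I want? Thanks in advance.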

For example, this query returns a list like this:

{ Name = { City = New York City, County = Bronx, Type = Type A }, Count = 4 }

{ Name = { City = New York City, County = Bronx, Type = Type B }, Count = 8 }

{ Name = { City = New York City, County = Bronx, Type = Type C }, Count = 24 }

{ Name = { City = New York City, County = Manhattan, Type = Type B }, Count = 43 }

{ Name = { City = New York City, County = Manhattan, Type = Type C }, Count = 58 }

{ Name = { City = Seattle, County = King County, Type = Type D }, Count = 43 }

{ Name = { City = Seattle, County = King County, Type = Type A }, Count = 67 }

{ Name = { City = Seattle, County = Snohomish County, Type = Type C }, Count = 67 }

I want to split this list into several lists, one per city and county:

List 1:

{ Name = { City = New York City, County = Bronx, Type = Type A }, Count = 4 }

{ Name = { City = New York City, County = Bronx, Type = Type B }, Count = 8 }

{ Name = { City = New York City, County = Bronx, Type = Type C }, Count = 24 }

List 2:

{ Name = { City = New York City, County = Manhattan, Type = Type B }, Count = 43 }

{ Name = { City = New York City, County = Manhattan, Type = Type C }, Count = 58 }

List 3:

{ Name = { City = Seattle, County = King County, Type = Type D }, Count = 43 }

{ Name = { City = Seattle, County = King County, Type = Type A }, Count = 67 }

List 4:

{ Name = { City = Seattle, County = Snohomish County, Type = Type C }, Count = 67 }

1 answer:

Answer 0 (score: 1)

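Possibility 1: Your database is not indexed to support your query (the where and join clauses).

To find out, get the generated SQL and look at its execution plan. If the plan says nested loop join -> clustered index scan, you have found the problem.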

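Possibility 2: You have run into the n + 1 problem.

In LINQ's GroupBy, a group consists of the group key and the group's members. In most SQL implementations, however, GROUP BY gives you the group key and aggregates. To fetch the members of a group, a separate query is issued. If there are n groups, n queries must be issued (the +1 is the original query).

To find out, get the generated SQL. If a bunch of extra queries are being issued, and any of them says clustered index scan, you have found the problem.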
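The difference between the two group shapes can be seen in plain LINQ to Objects. The following is only an illustrative sketch: the class and method names (GroupShapeSketch, CountPerKey, MembersPerKey) are made up for this example, and whether a given LINQ provider really issues one query per group depends on the provider and its version.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the question's code.
public static class GroupShapeSketch
{
    // Key + aggregate: each group collapses to a single value.
    // This is the shape a SQL GROUP BY can return directly.
    public static Dictionary<string, int> CountPerKey(
        IEnumerable<KeyValuePair<string, string>> rows)
    {
        return rows.GroupBy(r => r.Key)
                   .ToDictionary(g => g.Key, g => g.Count());
    }

    // Key + members: each group keeps every element that belongs to it.
    // SQL GROUP BY has no direct equivalent, so a provider may need
    // extra queries to fetch the members of each group.
    public static Dictionary<string, List<string>> MembersPerKey(
        IEnumerable<KeyValuePair<string, string>> rows)
    {
        return rows.GroupBy(r => r.Key)
                   .ToDictionary(g => g.Key, g => g.Select(r => r.Value).ToList());
    }
}
```

In the question, the first GroupBy only needs Count(), which is the aggregate shape, while the second GroupBy enumerates each group's members.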

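Possibility 3: You are actually issuing n ^ 2 queries (with ~5,000 records, that could be as many as ~25,000,000).

Well, you grouped twice, so it could be a doubly nested loop. Look at the generated SQL to find out.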

The simplest fix for all of this is to pull the 5,000 records into memory before grouping. An easy way to do that is to call ToList before calling GroupBy.
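A minimal sketch of that in-memory approach follows. It assumes CityCountyType has the three string properties used in the question; TypeCount and GroupingSketch are hypothetical names introduced here for illustration. One thing to watch for once the grouping runs in memory: the question's OrderBy(x => x.Name) sorts by an anonymous-type key, which LINQ to Objects cannot compare (anonymous types do not implement IComparable) and which would throw at run time, so the sketch orders by City and County instead.

```csharp
using System.Collections.Generic;
using System.Linq;

// Mirrors the CityCountyType class used in the question (assumed shape).
public class CityCountyType
{
    public string City { get; set; }
    public string County { get; set; }
    public string Type { get; set; }
}

// Hypothetical result type: one row per (City, County, Type) with its count.
public class TypeCount
{
    public string City { get; set; }
    public string County { get; set; }
    public string Type { get; set; }
    public int Count { get; set; }
}

public static class GroupingSketch
{
    // Takes IEnumerable<T>, so both GroupBy calls run as LINQ to Objects
    // and no SQL is generated for them.
    public static List<List<TypeCount>> GroupByCityAndCounty(IEnumerable<CityCountyType> rows)
    {
        return rows
            // First pass: count rows per (City, County, Type).
            .GroupBy(r => new { r.City, r.County, r.Type })
            .Select(g => new TypeCount
            {
                City = g.Key.City,
                County = g.Key.County,
                Type = g.Key.Type,
                Count = g.Count()
            })
            // Second pass: one inner list per (City, County).
            .GroupBy(x => new { x.City, x.County })
            .OrderBy(g => g.Key.City)
            .ThenBy(g => g.Key.County)
            .Select(g => g.OrderByDescending(x => x.Count).ToList())
            .ToList();
    }
}
```

With the question's query this would be called as GroupingSketch.GroupByCityAndCounty(Data.ToList()), so the database only runs the joined select and everything after ToList happens in memory.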