Most efficient way to call LINQ on a list of info objects?

Asked: 2013-03-07 21:32:14

Tags: c# performance linq extension-methods

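I am trying to get the data out of my info objects, which I have attached to a list of library objects. I have two solutions, and both seem inefficient. Is there a way to reduce this to a single OfType call without the LINQ query having to be the longer variant?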

using System;
using System.Collections.Generic;
using System.Linq;

namespace LinqQueries
{

    // Test the linq queries
    public class Test
    {
        public void TestIt()
        {
            List<ThirdParty> As = new List<ThirdParty>();

            // This is nearly the query I want to run, find A and C where B 
            // and C match criteria
            var cData = from a in As
                        from b in a.myObjects.OfType<MyInfo>()
                        where b.someProp == 1
                        from c in b.cs
                        where c.data == 1
                        select new {a, c};

            // This treats A and B as the same object, which is what I
            // really want, but it calls two sub-queries under the hood, 
            // which seems less efficient 
            var cDataShorter = from a in As
                               from c in a.GetCs()
                               where a.GetMyProp() == 1
                               where c.data == 1
                               select new { a, c };
        }
    }

    // library class I can't change
    public class ThirdParty
    {
        // Generic list of objects I can put my info object in
        public List<Object> myObjects;
    }

    // my info class that I add to ThirdParty
    public class MyInfo
    {
        public List<C> cs;
        public int someProp;
    }

    // My extension method for A to simplify some things.
    static public class MyExtentionOfThirdPartyClass
    {
        // Get the first MyInfo in ThirdParty
        public static MyInfo GetB(this ThirdParty a)
        {
            return (from b in a.myObjects.OfType<MyInfo>()
                    select b).FirstOrDefault();
        }

        // more hidden linq to slow things down...
        public static int GetMyProp(this ThirdParty a)
        {
            return a.GetB().someProp;
        }

        // get the list of cs with hidden linq
        public static List<C> GetCs(this ThirdParty a)
        {
            return a.GetB().cs;
        }
    }

    // fairly generic object with data in it
    public class C
    {
        public int data;
    }
}

2 Answers:

Answer 0 (score: 1)

If you are saying that cDataShorter produces the correct result, then you can rewrite it like this:

var cData = As
  .SelectMany(a => a.myObjects, (aa, mo) => new R {Tp = aa, Mi = mo as MyInfo})
  .Where(r => r.Mi != null && r.Mi.someProp == 1)
  // Uncomment if you need only one (the first) MyInfo from each ThirdParty:
  //.Distinct(new Comparer<R>((r1, r2) => r1.Tp.Equals(r2.Tp)))
  // You don't need R if you're not going to use Distinct; just use an anonymous type.
  .SelectMany(r => r.Mi.cs, (rr, c) => new {a = rr.Tp, c})
  .Where(ao => ao.c.data == 1);

public class R {
    public ThirdParty Tp;
    public MyInfo Mi;
}

For simplicity, the Comparer is taken from there
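
The Comparer used in the commented-out Distinct call is not shown in the answer. As a rough sketch of what such a helper might look like, assuming it is a lambda-based IEqualityComparer<T>: the name LambdaComparer, the extra hash delegate, and the KeyValuePair demo data below are all hypothetical and not taken from the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not from the original answer): wraps an equality
// lambda in an IEqualityComparer<T> so that it can be passed to Distinct.
public sealed class LambdaComparer<T> : IEqualityComparer<T>
{
    private readonly Func<T, T, bool> _equals;
    private readonly Func<T, int> _hash;

    public LambdaComparer(Func<T, T, bool> equals, Func<T, int> hash)
    {
        _equals = equals;
        _hash = hash;
    }

    public bool Equals(T x, T y) { return _equals(x, y); }

    // Distinct compares hash codes before calling Equals, so items that
    // should count as equal must also produce the same hash code.
    public int GetHashCode(T obj) { return _hash(obj); }
}

public static class ComparerDemo
{
    // Collapses pairs that share a key, analogous to keeping one R per ThirdParty.
    public static List<KeyValuePair<string, int>> OnePerKey(
        IEnumerable<KeyValuePair<string, int>> pairs)
    {
        var byKey = new LambdaComparer<KeyValuePair<string, int>>(
            (p1, p2) => p1.Key == p2.Key,
            p => p.Key.GetHashCode());
        return pairs.Distinct(byKey).ToList();
    }

    public static void Main()
    {
        var pairs = new[]
        {
            new KeyValuePair<string, int>("a", 1),
            new KeyValuePair<string, int>("a", 2),
            new KeyValuePair<string, int>("b", 3),
        };
        foreach (var p in OnePerKey(pairs))
            Console.WriteLine(p.Key + " " + p.Value);
    }
}
```

A comparer built from an equality lambda alone cannot supply a meaningful hash code, which is why this sketch takes a second delegate. In practice Distinct yields the first item it sees for each key, although the documentation describes its result as unordered.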

Answer 1 (score: 1)

Unfortunately, the answer is "it depends". I had to write the query both ways and time the runs.

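With 1000 ThirdParty objects, 1 MyInfo object in each, 1000 C's in each of those, and every result matching the criteria, the first query is twice as fast. If no MyInfo object matches the criteria, the first query is faster by two orders of magnitude. However, if you have multiple MyInfo objects per ThirdParty, the efficiency reverses: with 100 ThirdParty objects, 100 MyInfo objects in each, 100 C's in each of those, and every result matching, the second query is faster than the first by two orders of magnitude. With no MyInfo objects matching, the first query comes out faster again.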
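
The answer does not include its timing harness. A minimal sketch of how such a comparison could be set up with Stopwatch is shown here; the Build and TimeMs helpers and the data sizes are assumptions for illustration, while the two queries and the extension methods follow the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public class C { public int data; }
public class MyInfo { public List<C> cs; public int someProp; }
public class ThirdParty { public List<object> myObjects; }

public static class ThirdPartyExtensions
{
    public static MyInfo GetB(this ThirdParty a)
    {
        return a.myObjects.OfType<MyInfo>().FirstOrDefault();
    }
    public static int GetMyProp(this ThirdParty a) { return a.GetB().someProp; }
    public static List<C> GetCs(this ThirdParty a) { return a.GetB().cs; }
}

public static class TimingSketch
{
    // Hypothetical generator: nA ThirdParty objects, each with nB MyInfo
    // objects (someProp = prop), each with nC C items whose data is 1.
    public static List<ThirdParty> Build(int nA, int nB, int nC, int prop)
    {
        return Enumerable.Range(0, nA).Select(i => new ThirdParty
        {
            myObjects = Enumerable.Range(0, nB).Select(j => (object)new MyInfo
            {
                someProp = prop,
                cs = Enumerable.Range(0, nC).Select(k => new C { data = 1 }).ToList()
            }).ToList()
        }).ToList();
    }

    // First query from the question: visits every MyInfo in myObjects.
    public static int CountFirst(List<ThirdParty> As)
    {
        return (from a in As
                from b in a.myObjects.OfType<MyInfo>()
                where b.someProp == 1
                from c in b.cs
                where c.data == 1
                select new { a, c }).Count();
    }

    // Second query from the question: only the first MyInfo, via extension methods.
    public static int CountSecond(List<ThirdParty> As)
    {
        return (from a in As
                from c in a.GetCs()
                where a.GetMyProp() == 1
                where c.data == 1
                select new { a, c }).Count();
    }

    public static long TimeMs(Func<int> query)
    {
        query(); // warm-up run so JIT compilation is not part of the measurement
        var sw = Stopwatch.StartNew();
        query();
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        var single = Build(200, 1, 200, 1);
        var many = Build(50, 50, 50, 1);
        Console.WriteLine("1 MyInfo each:  q1 " + TimeMs(() => CountFirst(single))
            + " ms, q2 " + TimeMs(() => CountSecond(single)) + " ms");
        Console.WriteLine("50 MyInfo each: q1 " + TimeMs(() => CountFirst(many))
            + " ms, q2 " + TimeMs(() => CountSecond(many)) + " ms");
    }
}
```

Note that the two queries are not equivalent once a ThirdParty holds several MyInfo objects: the first enumerates all of them, while the second only ever looks at the first one. That difference in the amount of work may account for part of the reversal described above.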

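I actually ended up implementing the slower solution, because it made the code cleaner and the performance of the slower query was not that bad.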
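
For readers who want the readability of the extension-method query without the repeated lookups, one possible middle ground (not proposed in either answer) is a let clause, which evaluates OfType once per ThirdParty and reuses the result. This is a sketch under the assumption that only the first MyInfo per ThirdParty matters, as in GetB; the LetSketch class and its demo data are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class C { public int data; }
public class MyInfo { public List<C> cs; public int someProp; }
public class ThirdParty { public List<object> myObjects; }

public static class LetSketch
{
    public static List<Tuple<ThirdParty, C>> Query(IEnumerable<ThirdParty> As)
    {
        var q = from a in As
                // One OfType scan per ThirdParty; b is null if no MyInfo is attached.
                let b = a.myObjects.OfType<MyInfo>().FirstOrDefault()
                where b != null && b.someProp == 1
                from c in b.cs
                where c.data == 1
                select Tuple.Create(a, c);
        return q.ToList();
    }

    public static void Main()
    {
        var info = new MyInfo
        {
            someProp = 1,
            cs = new List<C> { new C { data = 1 }, new C { data = 2 }, new C { data = 1 } }
        };
        var As = new List<ThirdParty>
        {
            new ThirdParty { myObjects = new List<object> { "unrelated", info } },
            new ThirdParty { myObjects = new List<object>() } // no MyInfo: skipped
        };
        Console.WriteLine(Query(As).Count);
    }
}
```

Unlike GetMyProp and GetCs in the question, the null check here also covers ThirdParty objects that have no MyInfo attached.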