Efficient implementation of a "ThenBy" sort

Date: 2013-02-02 22:15:20

Tags: c# performance linq sorting

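I'm having to write an "immediate" mode implementation of Linq (due to memory allocation restrictions on Unity / Mono - long story, not really important).

Everything performed as fast as or faster than real Linq until I got to ThenBy. Clearly my approach there is flawed, because performance drops to 4x slower than the real thing.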

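So what I'm doing at the moment is -

For each OrderBy / ThenBy clause

  • Create a list for the selector's results and add the result of evaluating the selector on every item to it
  • Create a lambda that takes two indices and uses the default comparer to compare the list entries at those indices

It looks like this: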

public static IEnumerable<T> OrderByDescending<T,TR>(this IEnumerable<T> source, Func<T,TR> clause, IComparer<TR> comparer = null)
{
    comparer = comparer ?? Comparer<TR>.Default;
    var linqList = source as LinqList<T>;
    if(linqList == null)
    {
        linqList = Recycler.New<LinqList<T>>();
        linqList.AddRange(source);
    }
    if(linqList.sorter!=null)
        throw new Exception("Use ThenBy and ThenByDescending after an OrderBy or OrderByDescending");
    var keys = Recycler.New<List<TR>>();
    keys.Capacity = keys.Capacity > linqList.Count ? keys.Capacity : linqList.Count;
    foreach(var item in source)
    {
        keys.Add(clause(item));
    }
    linqList.sorter = (x,y)=>comparer.Compare(keys[y],keys[x]); // swapped arguments instead of negating, so int.MinValue cannot overflow
    return linqList;


}

public static IEnumerable<T> ThenBy<T,TR>(this IEnumerable<T> source, Func<T,TR> clause, IComparer<TR> comparer = null)
{
    comparer = comparer ?? Comparer<TR>.Default;
    var linqList = source as LinqList<T>;
    if(linqList == null || linqList.sorter==null)
    {
        throw new Exception("Use OrderBy or OrderByDescending first");
    }
    var keys = Recycler.New<List<TR>>();
    keys.Capacity = keys.Capacity > linqList.Count ? keys.Capacity : linqList.Count;
    foreach(var item in source)
    {
        keys.Add(clause(item));
    }
    linqList.sorters.Add((z,x,y)=>z != 0 ? z : comparer.Compare(keys[x],keys[y]));
    return linqList;


}

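Then what I do in the sort function is create a lambda that applies the sorts in order - so I end up with a function that looks like a Comparison<int> over indices and returns the right ordering.

This is where performance gets really bad. I've tried versions using currying and different signatures for the OrderBy and ThenBy functions, but nothing is really any faster, and I'm wondering if I'm just missing a trick about multi-key sorting.

The sorting variables and function: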

    public List<Func<int,int,int,int>> sorters = new List<Func<int, int, int, int>>();
    public Func<int,int,int> sorter;
    public List<int> sortList = new List<int>();
    bool sorted;
    private List<T> myList = new List<T>();

    void ResolveSorters()
    {
        if(sorter==null)
            return;

        Func<int,int,int> function = null;

        if(sorters.Count==0)
        {
            function = sorter;
        }
        else
        {
            function = sorter;
            foreach(var s in sorters)
            {
                var inProgress = function;
                var current = s;
                function = (x,y)=>current(inProgress(x,y), x,y);
            }
        }
        sortList.Capacity = sortList.Capacity < myList.Count ? myList.Count : sortList.Capacity;
        sortList.Clear();
        sortList.AddRange(System.Linq.Enumerable.Range(0,myList.Count));
        //var c = myList.Count;
        /*for(var i =0; i < c; i++)
            sortList.Add(i);*/
        sortList.Sort(new Comparison<int>(function));
        sorted = true;
        sorters.Clear();
    }

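2 answers:

Answer 0: (score: 4)

I have to guess here, but I'll give it a try anyway. I think we should try to get rid of the nested lambdas and the delegate conversions. I'm not sure how well it performs. The sort function should look like this: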

Func<int, int, int>[] sorters = ...; //fill this. it really should be an array!
Comparison<int> comparison = (a, b) => {
 // try each key in order; the first non-zero result decides
 foreach (var s in sorters) {
  var cmp = s(a, b);
  if(cmp != 0) return cmp;
 }
 return 0;
};

So we got rid of the nested calls. It's all a simple loop now. You could build specialized versions for small loop sizes:

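Func<int, int, int>[] sorters = ...; //fill this. it really should be an array!
Comparison<int> comparison;
switch (sorters.Length) {
 case 2: {
   // copy the delegates into locals so the comparison never touches the array
   var s0 = sorters[0], s1 = sorters[1];
   comparison = (a, b) => {
     var cmp = s0(a, b);
     if(cmp != 0) return cmp;
     cmp = s1(a, b);
     if(cmp != 0) return cmp;
     return 0;
   };
   break;
 }
 // other lengths would be handled the same way
}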

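Unroll the loop so that no array is touched any more during sorting.

All of this is really a workaround for the fact that we have no static knowledge about the structure of the sort function. It would be much faster if the comparison function was simply passed in by the caller.

Update: Repro (throughput is 100% higher than LINQ)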
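To illustrate what "passed in by the caller" could look like, here is a minimal sketch, not taken from the question's code. The class name, method names and the two keys (lowest bit, then full value) are hypothetical. The point is only that a single hand-written comparison covers every key, so no chain of per-key delegates is invoked during the sort.

```csharp
using System;

public static class CallerSuppliedComparisonSketch
{
    // Hypothetical multi-key comparison written by the caller:
    // first key is the lowest bit, second key is the value itself.
    public static int CompareByLowBitThenValue(int a, int b)
    {
        int cmp = (a & 0x1).CompareTo(b & 0x1);
        return cmp != 0 ? cmp : a.CompareTo(b);
    }

    // Sorts a copy of the input with one delegate call per comparison.
    public static int[] SortedCopy(int[] values)
    {
        var copy = (int[])values.Clone();
        Array.Sort(copy, new Comparison<int>(CompareByLowBitThenValue));
        return copy;
    }
}
```

Whether this actually beats the composed version in a given Unity / Mono build would need to be measured; the sketch only shows the shape of the API.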

        Process.GetCurrentProcess().PriorityClass = ProcessPriorityClass.High;

        Func<int, int, int>[] sorters = new Func<int, int, int>[]
            {
                (a, b) => (a & 0x1).CompareTo(b & 0x1),
                (a, b) => (a & 0x2).CompareTo(b & 0x2),
                (a, b) => (a & 0x4).CompareTo(b & 0x4),
                (a, b) => a.CompareTo(b),
            };

        Func<int, int, int> comparisonB = sorters[0];
        for (int i = 1; i < sorters.Length; i++)
        {
            var func1 = comparisonB;
            var func2 = sorters[i];
            comparisonB = (a, b) =>
                {
                    var cmp = func1(a, b);
                    if (cmp != 0) return cmp;
                    return func2(a, b);
                };
        }
        var comparisonC = new Comparison<int>(comparisonB);

        Comparison<int> comparisonA = (a, b) =>
        {
            foreach (var s in sorters)
            {
                var cmp = s(a, b);
                if (cmp != 0) return cmp;
            }
            return 0;
        };

        Func<int, int, int> s0 = sorters[0], s1 = sorters[1], s2 = sorters[2], s3 = sorters[3];
        Comparison<int> comparisonD = (a, b) =>
            {
                var cmp = s0(a, b);
                if (cmp != 0) return cmp;
                cmp = s1(a, b);
                if (cmp != 0) return cmp;
                cmp = s2(a, b);
                if (cmp != 0) return cmp;
                cmp = s3(a, b);
                if (cmp != 0) return cmp;
                return 0;
            };

        {
            GC.Collect();
            var data = CreateSortData();
            var sw = Stopwatch.StartNew();
            Array.Sort(data, comparisonC);
            sw.Stop();
            Console.WriteLine(sw.Elapsed.TotalSeconds);
        }

        {
            GC.Collect();
            var data = CreateSortData();
            var sw = Stopwatch.StartNew();
            Array.Sort(data, comparisonA);
            sw.Stop();
            Console.WriteLine(sw.Elapsed.TotalSeconds);
        }

        {
            GC.Collect();
            var data = CreateSortData();
            var sw = Stopwatch.StartNew();
            Array.Sort(data, comparisonD);
            sw.Stop();
            Console.WriteLine(sw.Elapsed.TotalSeconds);
        }

        {
            GC.Collect();
            var data = CreateSortData();
            var sw = Stopwatch.StartNew();
            foreach (var source in data.OrderBy(x => x & 0x1).ThenBy(x => x & 0x2).ThenBy(x => x & 0x4).ThenBy(x => x))
            {

            }
            sw.Stop();
            Console.WriteLine(sw.Elapsed.TotalSeconds);
        }

Answer 1: (score: 0)

I sort my items by [Type] and then by [Price] like this:

Items = Items.OrderBy(i => i.Type).ToList();

for (var j = 0; j < Items.Count - 1; j++) // ordering ThenBy() AOT workaround
{
    for (var i = 0; i < Items.Count - 1; i++) 
    {
        if (Items[i].Type == Items[i + 1].Type && Items[i].Price > Items[i + 1].Price)
        {
            var temp = Items[i];

            Items[i] = Items[i + 1];
            Items[i + 1] = temp;
        }
    }
}
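For comparison, the same Type-then-Price ordering can probably be expressed as a single sort with one two-key comparison, which avoids the repeated passes over the list. This is only a sketch under assumptions: the Item class below is a hypothetical stand-in (the answer does not show the real type), Type is assumed to be a string compared ordinally, and Price a decimal. Whether List<T>.Sort with a lambda is acceptable under the AOT restrictions this answer is working around is not something the answer establishes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the elements of Items.
public class Item
{
    public string Type;
    public decimal Price;
}

public static class TypeThenPriceSketch
{
    // Sorts in place: primary key Type (ordinal), secondary key Price.
    public static void SortByTypeThenPrice(List<Item> items)
    {
        items.Sort((a, b) =>
        {
            int cmp = string.CompareOrdinal(a.Type, b.Type);
            return cmp != 0 ? cmp : a.Price.CompareTo(b.Price);
        });
    }
}
```

List<T>.Sort is not a stable sort, but that does not matter for the two keys here because the comparison itself decides every tie on Type by Price.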