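Improving performance when comparing two integer arrays of different lengths

Time: 2013-02-27 16:59:47

Tags: c# performance algorithm

I have a bottleneck (or at least an area where I think I can do better) in this comparer, which is basically an ordinal string comparer, but one that works on arrays of integers (ushort, though I don't think that matters).

The arrays can have different lengths, but length only matters when the elements [0..n] all match, where n is the length of the shorter array. In that case the longer array is considered "greater".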

1 2 3 < 1 2 4

1 2 3 5 < 1 2 4

1 2 5 3 > 1 2 4

1 2 3 4 > 1 2 3

    public int Compare(ushort[] x, ushort[] y)
    {
        int pos = 0;
        int len = Math.Min(x.Length, y.Length);
        while (pos < len && x[pos] == y[pos])
            pos++;

        return pos < len ?
            x[pos].CompareTo(y[pos]) :
            x.Length.CompareTo(y.Length);

    }
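
To make the ordering rules concrete, here is a minimal, self-contained sketch that runs the comparer above over the four examples. The class names `OrdinalUShortComparer` and `OrderingDemo` are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper class (not part of the original post) around the Compare method above.
public sealed class OrdinalUShortComparer : IComparer<ushort[]>
{
    public int Compare(ushort[] x, ushort[] y)
    {
        int pos = 0;
        int len = Math.Min(x.Length, y.Length);

        // Skip the common prefix.
        while (pos < len && x[pos] == y[pos])
            pos++;

        // The first differing element decides; otherwise the longer array is greater.
        return pos < len
            ? x[pos].CompareTo(y[pos])
            : x.Length.CompareTo(y.Length);
    }
}

public static class OrderingDemo
{
    public static void Main()
    {
        var cmp = new OrdinalUShortComparer();

        // Math.Sign normalizes the result, since CompareTo only guarantees the sign.
        Console.WriteLine(Math.Sign(cmp.Compare(new ushort[] { 1, 2, 3 }, new ushort[] { 1, 2, 4 })));    // -1
        Console.WriteLine(Math.Sign(cmp.Compare(new ushort[] { 1, 2, 3, 5 }, new ushort[] { 1, 2, 4 }))); // -1
        Console.WriteLine(Math.Sign(cmp.Compare(new ushort[] { 1, 2, 5, 3 }, new ushort[] { 1, 2, 4 }))); // 1
        Console.WriteLine(Math.Sign(cmp.Compare(new ushort[] { 1, 2, 3, 4 }, new ushort[] { 1, 2, 3 }))); // 1
    }
}
```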

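Any ideas on how to optimize this?

Update

In response to a commenter asking what I'm actually doing here: I realized that I asked a related question a long time ago which shows exactly what I'm doing, in context. The only major difference is that I now use a ushort array instead of a string for the key, since it is more compact.

Using the entire path as the key lets me use partial keys to get views from the sorted set, which gives high performance for subset queries. I'm trying to improve performance when building the index. Related question: Data structure for indexed searches of subsets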
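
A rough sketch of what "getting a view from the sorted set with a partial key" could look like, using `SortedSet<T>.GetViewBetween`. The names `PrefixViewDemo`, `ComparePaths` and `KeysWithPrefix` are hypothetical, and the upper bound relies on an assumption that is not stated in the post: no stored key ever contains `ushort.MaxValue`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PrefixViewDemo
{
    // Same ordinal ordering as the comparer in the question.
    public static int ComparePaths(ushort[] x, ushort[] y)
    {
        int len = Math.Min(x.Length, y.Length);
        for (int i = 0; i < len; i++)
            if (x[i] != y[i])
                return x[i].CompareTo(y[i]);
        return x.Length.CompareTo(y.Length);
    }

    // Returns a view of every key that starts with `prefix`.
    // Assumption: no stored key contains ushort.MaxValue, so prefix + [MaxValue]
    // sorts after every key that starts with the prefix.
    public static SortedSet<ushort[]> KeysWithPrefix(SortedSet<ushort[]> index, ushort[] prefix)
    {
        ushort[] upper = prefix.Concat(new[] { ushort.MaxValue }).ToArray();
        return index.GetViewBetween(prefix, upper);
    }

    public static void Main()
    {
        var index = new SortedSet<ushort[]>(Comparer<ushort[]>.Create(ComparePaths))
        {
            new ushort[] { 1, 2 },
            new ushort[] { 1, 2, 3 },
            new ushort[] { 1, 2, 4, 7 },
            new ushort[] { 1, 3 },
            new ushort[] { 2 },
        };

        // Prints the three keys that start with "1 2".
        foreach (ushort[] key in KeysWithPrefix(index, new ushort[] { 1, 2 }))
            Console.WriteLine(string.Join(" ", key));
    }
}
```

Every insertion into such a set calls the comparer O(log n) times, which is why the comparer dominates the index-building cost.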

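By the way, I'm very impressed with the responses here so far. I've asked quite a few questions on SO over the years, but these are by far the most thoughtful and interesting answers I've seen. I'm not yet sure what the right answer is for my specific problem (millions of comparisons of short arrays), but every one of them has taught me something I didn't know.

4 Answers:

Answer 0: (score: 3)

Here's what I came up with. I tested with up to around 16 million (2^24) elements, using a combination of your code and some parallel code.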

public int CompareParallel(ushort[]x, ushort[] y, int len, int segLen)
{
    int compareArrLen = ( len / segLen ) + 1;
    int [ ] compareArr = new int [ compareArrLen ];
    Parallel.For ( 0 , compareArrLen , 
                   new Action<int , ParallelLoopState> ( ( i , state ) =>
    {
        if ( state.LowestBreakIteration.HasValue 
                 && state.LowestBreakIteration.Value < i )
            return;
        int segEnd = ( i + 1 ) * segLen;
        int k = len < segEnd ? len : segEnd;
        for ( int j = i * segLen ; j < k ; j++ )
            if ( x [ j ] != y [ j ] )
            {
                compareArr [ i ] = ( x [ j ].CompareTo ( y [ j ] ) );
                state.Break ( );
                return;
            }
    } ) );
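    // Return the result of the lowest-indexed segment that found a difference.
    // Higher segments may also have recorded one before they observed the break.
    for ( int r = 0 ; r < compareArrLen ; r++ )
    {
        if ( compareArr [ r ] != 0 )
            return compareArr [ r ];
    }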
    return x.Length.CompareTo ( y.Length );
}

public int CompareSequential ( ushort [ ] x , ushort [ ] y, int len )
{
    int pos = 0;
    while ( pos < len && x [ pos ] == y [ pos ] )
        pos++;

    return pos < len ?
        x [ pos ].CompareTo ( y [ pos ] ) :
        x.Length.CompareTo ( y.Length );

}

public int Compare( ushort [ ] x , ushort [ ] y ) 
{
    //determined through testing to be the best on my machine
    const int cutOff = 4096;
    int len = x.Length < y.Length ? x.Length : y.Length;
    //check if len is above a specific threshold
    //and if the last element and one in the middle are equal.
    //Equality was chosen because it suggests that more than
    //50% of the list may be equal, which would make the
    //parallel overhead worth the effort
    if ( len > cutOff && x [ len - 1 ] == y [ len - 1 ] 
           && x [ len/2 ] == y [ len/2 ] )
    {
        //a segment length of around 8% of the array size
        //gave the best results in testing on my machine
        return CompareParallel ( x , y , len , (len / 100)*8 );
    }
    return CompareSequential ( x , y, len );
}

Here's the test I wrote:

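using System;
using System.Diagnostics;
using System.IO;
using System.Text;
using System.Threading.Tasks;

class Program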
{
    [Flags]
    private enum InfoLevel:byte
    {
        Detail=0x01, Summary=0x02
    }

    private static InfoLevel logLevel = InfoLevel.Summary;

    private static void LogDetail ( string content ) 
    {
        LogInfo ( InfoLevel.Detail,content );
    }

    private static void LogSummary ( string content ) 
    {
        LogInfo ( InfoLevel.Summary , content );
    }

    private static void LogInfo ( InfoLevel level , string content ) 
    {
        if ( ( level & logLevel ) == level )
            Console.WriteLine ( content );
    }

    private static void LogInfo ( InfoLevel level , string format, 
                                  params object[] arg )
    {
        if ( ( level & logLevel ) == level )
            Console.WriteLine ( format:format, arg:arg  );
    }

    private static void LogDetail ( string format , params object [ ] arg )
    {
        LogInfo ( InfoLevel.Detail , format, arg );
    }

    private static void LogSummary ( string format , params object [ ] arg )
    {
        LogInfo ( InfoLevel.Summary , format, arg );
    }

    const string _randTestResultHeader = "\r\nRandom Array Content\r\n";
    const string _equalArrayResultHeader = "Only Length Different\r\n\r\n";
    const string _summaryTestResultsHeader = 
                                "Size\t\tOrig Elps\tPara Elps\tComp Elps\r\n";
    const string _summaryBodyContent = 
                         "{0}\t\t{1:0.0000}\t\t{2:0.0000}\t\t{3:0.00000}\r\n";

    static void Main ( string [ ] args )
    {
        Console.SetOut(new StreamWriter(File.Create("out.txt")));

        int segLen = 0;
        int segPercent = 7;
        Console.WriteLine ( "Algorithm Test, Time results in milliseconds" );
        for ( ; segPercent < 13; segPercent ++ )
        {
            Console.WriteLine ( 
                      "Test Run with parallel Dynamic segment size at {0}%"
                       +" of Array Size (Comp always at 8%)\r\n" , segPercent);

            StringBuilder _aggrRandResults = new StringBuilder ( );
            StringBuilder _aggrEqualResults = new StringBuilder ( );

            _aggrRandResults.Append ( _randTestResultHeader );
            _aggrEqualResults.Append ( _equalArrayResultHeader );

            _aggrEqualResults.Append ( _summaryTestResultsHeader );
            _aggrRandResults.Append ( _summaryTestResultsHeader );


            for ( int i = 10 ; i < 25 ; i++ )
            {
                int baseLen = ( int ) Math.Pow ( 2 , i );
                segLen = ( baseLen / 100 ) * segPercent;

                var testName = "Equal Length ";
                var equalTestAverage = RandomRunTest ( testName , baseLen , 
                                                       baseLen, segLen );
                testName = "Left Side Larger";
                var lslargerTestAverage=RandomRunTest(testName,baseLen+10, 
                                                      baseLen, segLen );
                testName = "Right Side Larger";
                var rslargerTestAverage = RandomRunTest ( testName , baseLen ,
                                                        baseLen + 10, segLen );

                double [ ] completelyRandomTestAvg = new double [ 3 ];
                for ( int l = 0 ; l < completelyRandomTestAvg.Length ; l++ )
                    completelyRandomTestAvg [ l ] = ( equalTestAverage [ l ] +
                                                 lslargerTestAverage [ l ] +
                                              rslargerTestAverage [ l ] ) / 3;

                LogDetail ( "\r\nRandom Test Results:" );
                LogDetail ("Original Composite Test Average: {0}" ,
                           completelyRandomTestAvg [ 0 ] );
                LogDetail ( "Parallel Composite Test Average: {0}" ,
                            completelyRandomTestAvg [ 1 ]  );

                _aggrRandResults.AppendFormat ( _summaryBodyContent , 
                    baseLen , 
                    completelyRandomTestAvg [ 0 ] , 
                    completelyRandomTestAvg [ 1 ] , 
                    completelyRandomTestAvg [ 2 ]);

                testName = "Equal Len And Values";
                var equalEqualTest = EqualTill ( testName , baseLen , 
                                                 baseLen, segLen );

                testName = "LHS Larger";
                var equalLHSLargerTest = EqualTill ( testName , baseLen + 10 , 
                                                     baseLen, segLen );

                testName = "RHS Larger";
                var equalRHSLargerTest = EqualTill ( testName , baseLen , 
                                                     baseLen + 10, segLen );

                double [ ] mostlyEqualTestAvg = new double [ 3 ];
                for ( int l = 0 ; l < mostlyEqualTestAvg.Length ; l++ )
                    mostlyEqualTestAvg [ l ] = ( ( equalEqualTest [ l ] +
                                            equalLHSLargerTest [ l ] +
                                            equalRHSLargerTest [ l ] ) / 3 );

                LogDetail( "\r\nLength Different Test Results" );
                LogDetail( "Original Composite Test Average: {0}" , 
                           mostlyEqualTestAvg [ 0 ] );
                LogDetail( "Parallel Composite Test Average: {0}" , 
                            mostlyEqualTestAvg [ 1 ] );

                _aggrEqualResults.AppendFormat ( _summaryBodyContent , 
                                                 baseLen , 
                                                 mostlyEqualTestAvg [ 0 ] , 
                                                 mostlyEqualTestAvg [ 1 ] ,
                                                 mostlyEqualTestAvg [ 2 ]);
            }

            LogSummary ( _aggrRandResults.ToString() + "\r\n");
            LogSummary ( _aggrEqualResults.ToString()+ "\r\n");

        }
        Console.Out.Flush ( );
    }


    private const string _testBody = 
                  "\r\n\tOriginal:: Result:{0}, Elapsed:{1}"
                 +"\r\n\tParallel:: Result:{2}, Elapsed:{3}"
                 +"\r\n\tComposite:: Result:{4}, Elapsed:{5}";
    private const string _testHeader = 
                  "\r\nTesting {0}, Array Lengths: {1}, {2}";
    public static double[] RandomRunTest(string testName, int shortArr1Len, 
                                         int shortArr2Len, int parallelSegLen)
    {

        var shortArr1 = new ushort [ shortArr1Len ];
        var shortArr2 = new ushort [ shortArr2Len ];
        double [ ] avgTimes = new double [ 3 ];

        LogDetail ( _testHeader , testName , shortArr1Len , shortArr2Len ) ;
        for ( int i = 0 ; i < 10 ; i++ )
        {
            int arrlen1 = shortArr1.Length , arrlen2 = shortArr2.Length;

            double[] currResults = new double [ 3 ];

            FillCompareArray ( shortArr1 , shortArr1.Length );
            FillCompareArray ( shortArr2 , shortArr2.Length );

            var sw = new Stopwatch ( );

            //Force a garbage collection before each timed call
            //so that a collection triggered by an earlier test
            //does not affect the timing of a later one
            GC.Collect ( );
            sw.Start ( );
            int origResult = Compare ( shortArr1 , shortArr2 );
            sw.Stop ( );
            currResults[0] = sw.Elapsed.TotalMilliseconds;
            sw.Reset ( );

            GC.Collect ( );
            sw.Start ( );
            int parallelResult = CompareParallelOnly ( shortArr1 , shortArr2, 
                                                       parallelSegLen );
            sw.Stop ( );
            currResults [ 1 ] = sw.Elapsed.TotalMilliseconds;
            sw.Reset ( );

            GC.Collect ( );
            sw.Start ( );
            int compositeResults = CompareComposite ( shortArr1 , shortArr2 );
            sw.Stop ( );                
            currResults [ 2 ] = sw.Elapsed.TotalMilliseconds;

            LogDetail ( _testBody, origResult , currResults[0] , 
                        parallelResult , currResults[1], 
                        compositeResults, currResults[2]);

            for ( int l = 0 ; l < currResults.Length ; l++ )
                avgTimes [ l ] = ( ( avgTimes[l]*i)+currResults[l]) 
                                    / ( i + 1 );
        }
        LogDetail ( "\r\nAverage Run Time Original: {0}" , avgTimes[0]);
        LogDetail ( "Average Run Time Parallel: {0}" , avgTimes[1]);
        LogDetail ( "Average Run Time Composite: {0}" , avgTimes [ 2 ] );

        return avgTimes;
    }

    public static double [ ] EqualTill ( string testName, int shortArr1Len , 
                                       int shortArr2Len, int parallelSegLen)
    {

        const string _testHeader = 
               "\r\nTesting When Array Difference is "
               +"Only Length({0}), Array Lengths: {1}, {2}";

        int baseLen = shortArr1Len > shortArr2Len 
                          ? shortArr2Len : shortArr1Len;

        var shortArr1 = new ushort [ shortArr1Len ];
        var shortArr2 = new ushort [ shortArr2Len ];
        double [ ] avgTimes = new double [ 3 ];

        LogDetail( _testHeader , testName , shortArr1Len , shortArr2Len );
        for ( int i = 0 ; i < 10 ; i++ )
        {

            FillCompareArray ( shortArr1 , shortArr1Len);
            Array.Copy ( shortArr1 , shortArr2, baseLen );
            double [ ] currElapsedTime = new double [ 3 ];
            var sw = new Stopwatch ( );
            //See previous explanation
            GC.Collect ( );
            sw.Start ( );
            int origResult = Compare ( shortArr1 , shortArr2 );
            sw.Stop ( );
            currElapsedTime[0] = sw.Elapsed.TotalMilliseconds;
            sw.Reset ( );

            GC.Collect ( );
            sw.Start ( );
            int parallelResult = CompareParallelOnly ( shortArr1, shortArr2, 
                                     parallelSegLen );
            sw.Stop ( );
            currElapsedTime[1] = sw.Elapsed.TotalMilliseconds;
            sw.Reset ( );

            GC.Collect ( );
            sw.Start ( );
            var compositeResult = CompareComposite ( shortArr1 , shortArr2 );
            sw.Stop ( );
            currElapsedTime [ 2 ] = sw.Elapsed.TotalMilliseconds;

            LogDetail ( _testBody , origResult , currElapsedTime[0] , 
                parallelResult , currElapsedTime[1], 
                compositeResult,currElapsedTime[2]);

            for ( int l = 0 ; l < currElapsedTime.Length ; l++ )
                avgTimes [ l ] = ( ( avgTimes [ l ] * i ) 
                                   + currElapsedTime[l])/(i + 1);
        }
        LogDetail ( "\r\nAverage Run Time Original: {0}" , avgTimes [ 0 ] );
        LogDetail ( "Average Run Time Parallel: {0}" , avgTimes [ 1 ] );
        LogDetail ( "Average Run Time Composite: {0}" , avgTimes [ 2 ] );
        return avgTimes;
    }


    static Random rand = new Random ( );
    public static void FillCompareArray ( ushort[] compareArray, int length ) 
    {
        var retVals = new byte[length];
        ( rand ).NextBytes ( retVals );
        Array.Copy ( retVals , compareArray , length);
    }

    public static int CompareParallelOnly ( ushort [ ] x , ushort[] y, 
                                            int segLen ) 
    {
       int len = x.Length<y.Length ? x.Length:y.Length;
       int compareArrLen = (len/segLen)+1;
       int[] compareArr = new int [ compareArrLen ];
       Parallel.For ( 0 , compareArrLen , 
           new Action<int , ParallelLoopState> ( ( i , state ) =>
       {
           if ( state.LowestBreakIteration.HasValue 
                    && state.LowestBreakIteration.Value < i )
               return;
           int segEnd = ( i + 1 ) * segLen;
           int k = len<segEnd?len:segEnd;

           for ( int j = i * segLen ; j < k ; j++ )
               if ( x [ j ] != y [ j ] )
               {
                   compareArr [ i ] = ( x [ j ].CompareTo ( y [ j ] ) );
                   state.Break ( );
                   return;
               }
       } ) );
       // Return the result of the lowest-indexed segment that found a difference.
       for ( int r = 0 ; r < compareArrLen ; r++ )
       {
           if ( compareArr [ r ] != 0 )
               return compareArr [ r ];
       }
       return x.Length.CompareTo ( y.Length );
    }

    public static int Compare ( ushort [ ] x , ushort [ ] y )
    {
        int pos = 0;
        int len = Math.Min ( x.Length , y.Length );
        while ( pos < len && x [ pos ] == y [ pos ] )
            pos++;

        return pos < len ?
            x [ pos ].CompareTo ( y [ pos ] ) :
            x.Length.CompareTo ( y.Length );

    }

    public static int CompareParallel ( ushort[] x, ushort[] y, int len, 
                                        int segLen )
    {
        int compareArrLen = ( len / segLen ) + 1;
        int [ ] compareArr = new int [ compareArrLen ];
        Parallel.For ( 0 , compareArrLen , 
            new Action<int , ParallelLoopState> ( ( i , state ) =>
        {
            if ( state.LowestBreakIteration.HasValue 
                 && state.LowestBreakIteration.Value < i )
                return;
            int segEnd = ( i + 1 ) * segLen;
            int k = len < segEnd ? len : segEnd;
            for ( int j = i * segLen ; j < k ; j++ )
                if ( x [ j ] != y [ j ] )
                {
                    compareArr [ i ] = ( x [ j ].CompareTo ( y [ j ] ) );
                    state.Break ( );
                    return;
                }
        } ) );
        // Return the result of the lowest-indexed segment that found a difference.
        for ( int r = 0 ; r < compareArrLen ; r++ )
        {
            if ( compareArr [ r ] != 0 )
                return compareArr [ r ];
        }
        return x.Length.CompareTo ( y.Length );
    }

    public static int CompareSequential(ushort [ ] x , ushort [ ] y, int len)
    {
        int pos = 0;
        while ( pos < len && x [ pos ] == y [ pos ] )
            pos++;

        return pos < len ?
            x [ pos ].CompareTo ( y [ pos ] ) :
            x.Length.CompareTo ( y.Length );

    }

    public static int CompareComposite ( ushort [ ] x , ushort [ ] y ) 
    {
        const int cutOff = 4096;
        int len = x.Length < y.Length ? x.Length : y.Length;

        if ( len > cutOff && x [ len - 1 ] == y [ len - 1 ]
                 && x [ len/2 ] == y [ len/2 ] )
            return CompareParallel ( x , y , len , (len / 100)*8 );

        return CompareSequential ( x , y, len );
    }
}

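Note:

Make sure you build with optimizations enabled. The results were very different when I left that step out, and it made the parallel code look like a much bigger improvement than it actually was.

The result I got was roughly a 33% reduction in execution time for very long arrays of equal values. The time still grows linearly with the input, but at a slower rate. It also starts off slower for small data sets (fewer than 4092 elements on my machine), but the time taken there is usually so small (.001 ms on my machine) that it is still worth using in case you do get a large, almost equal array.

Answer 1: (score: 1)

It probably won't make much of a difference, but you can set the last element to be different to get rid of the pos < len check in the while loop. There's also the rather trivial change from pos++ to ++pos.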

public int Compare(ushort[] x, ushort[] y)
{
    int pos = 0;
    int len = Math.Min(x.Length, y.Length);

    // the below is probably not worth it for less than 5 (or so) elements,
    //   so just do the old way
    if (len < 5)
    {
      while (pos < len && x[pos] == y[pos])
        ++pos;

      return pos < len ?
        x[pos].CompareTo(y[pos]) :
        x.Length.CompareTo(y.Length);
    }

    ushort lastX = x[len-1];
    bool lastSame = true;
    if (x[len-1] == y[len-1])
        --x[len-1]; // can be anything else
    else
        lastSame = false;

    while (x[pos] == y[pos])
        ++pos;

    x[len-1] = lastX; // restore the input array before returning

    return pos < len-1 ?
        x[pos].CompareTo(y[pos]) :
        lastSame ? x.Length.CompareTo(y.Length)
                 : lastX.CompareTo(y[len-1]);
}

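EDIT: You'll only really get a performance gain if many elements are the same from the beginning (and it will be worse when there are early differences, as pkuderov mentioned).

Answer 2: (score: 1)

Sorry for the long answer, but the question interested me so much that I spent a few hours investigating, and I want to share the results. I wrote a test case generator and a rough performance tester.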

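What it does:

  1. Generates fully random arrays
  2. Measures the execution time of the 3 comparison methods
  3. Generates arrays with a high probability of similarity
  4. Measures the execution time again.

I used 3 methods:

  1. The OP's
  2. Mine. Idea: replace the two index operations with pointer increments
  3. Dukeling's. Idea: remove the unnecessary comparison

I started with short arrays (length 5-15).

Method 1 was the fastest in both test variations (as predicted by pkuderov).

The situation changes if we increase the array length.

With array lengths between 500 and 1500 I got this: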

      Generating test cases ...
      Done. (5258 milliseconds)
      Compare1 took 18 milliseconds
      Compare2 took 18 milliseconds
      Compare3 took 33 milliseconds
      Generating 'similar' test cases ...
      Done. (5081 milliseconds)
      Compare1 took 359 milliseconds
      Compare2 took 313 milliseconds
      Compare3 took 295 milliseconds
      

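So, on the 'similar' test cases, method 2 gives a small gain over method 1, and method 3 an even smaller gain over method 2.

Resolution:

  1. If your arrays are short enough and/or there is a high probability that the first difference occurs at a small index, there is not much you can do (with the proposed methods).
  2. Otherwise you could try some combination of methods 2 and 3.

The code: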

      using System;
      using System.Diagnostics;
      
      namespace ConsoleExamples
          {
              class ArrayComparePerformance
              {
                  static readonly int testArraysNum = 100000;
                  static readonly int maxArrayLen = 1500;
                  static readonly int minArrayLen = 500;
                  static readonly int maxValue = 10;
      
                  public static void RunTest()
                  {
                      //Generate random arrays;
                      ushort[][] a = new ushort[testArraysNum][];
                      ushort[][] b = new ushort[testArraysNum][];
      
                      Random rand = new Random();
      
                      Console.WriteLine("Generating test cases ... " );
      
                      Stopwatch sw = new Stopwatch();
                      sw.Start();
      
                      for (int i = 0; i < testArraysNum; i++)
                      {
                          int len = rand.Next(maxArrayLen) + 1;
                          a[i] = new ushort[len];
                          for (int j = 0; j < len; j++)
                          {
                              a[i][j] = (ushort) rand.Next(maxValue);
                          }
      
      
      
                          len = rand.Next(maxArrayLen) + 1;
                          b[i] = new ushort[len];
                          for (int j = 0; j < len; j++)
                          {
                              b[i][j] = (ushort) rand.Next(maxValue);
                          }
      
      
      
                      }
      
                      sw.Stop();
                      Console.WriteLine("Done. ({0} milliseconds)", sw.ElapsedMilliseconds);
      
      
                      //compare1
                      sw.Restart();
                      for (int i = 0; i < testArraysNum; i++)
                      {
                          int result = Compare1(a[i], b[i]);
                      }
                      sw.Stop();
                      Console.WriteLine("Compare1 took " + sw.ElapsedMilliseconds.ToString() + " milliseconds");
      
                      //compare2
                      sw.Restart();
                      for (int i = 0; i < testArraysNum; i++)
                      {
                          int result = Compare2(a[i], b[i]);
                      }
                      sw.Stop();
                      Console.WriteLine("Compare2 took " + sw.ElapsedMilliseconds.ToString() + " milliseconds");
      
                      //compare3
                      sw.Restart();
                      for (int i = 0; i < testArraysNum; i++)
                      {
                          int result = Compare3(a[i], b[i]);
                      }
                      sw.Stop();
                      Console.WriteLine("Compare3 took " + sw.ElapsedMilliseconds.ToString() + " milliseconds");
      
      
                      //Generate "similar" arrays;
      
                      Console.WriteLine("Generating 'similar' test cases ... ");
      
                      sw.Restart();
      
                      for (int i = 0; i < testArraysNum; i++)
                      {
                          int len = rand.Next(maxArrayLen - minArrayLen) + minArrayLen -1;
                          a[i] = new ushort[len];
                          for (int j = 0; j < len; j++)
                          {
                              if (j < len/2)
                                  a[i][j] = (ushort)j;
                              else
                                  a[i][j] = (ushort)(rand.Next(2)  + j);
                          }
      
      
      
                          len = rand.Next(maxArrayLen - minArrayLen) + minArrayLen - 1;
                          b[i] = new ushort[len];
                          for (int j = 0; j < len; j++)
                          {
                              if (j < len/2)
                                  b[i][j] = (ushort)j;
                              else
                                  b[i][j] = (ushort)(rand.Next(2)  + j);
                          }
                      }
      
                      sw.Stop();
                      Console.WriteLine("Done. ({0} milliseconds)", sw.ElapsedMilliseconds);
      
      
                      //compare1
                      sw.Restart();
                      for (int i = 0; i < testArraysNum; i++)
                      {
                          int result = Compare1(a[i], b[i]);
                      }
                      sw.Stop();
                      Console.WriteLine("Compare1 took " + sw.ElapsedMilliseconds.ToString() + " milliseconds");
      
                      //compare2
                      sw.Restart();
                      for (int i = 0; i < testArraysNum; i++)
                      {
                          int result = Compare2(a[i], b[i]);
                      }
                      sw.Stop();
                      Console.WriteLine("Compare2 took " + sw.ElapsedMilliseconds.ToString() + " milliseconds");
      
                      //compare3
                      sw.Restart();
                      for (int i = 0; i < testArraysNum; i++)
                      {
                          int result = Compare3(a[i], b[i]);
                      }
                      sw.Stop();
                      Console.WriteLine("Compare3 took " + sw.ElapsedMilliseconds.ToString() + " milliseconds");
      
      
                      Console.ReadKey();
                  }
      
                  public static int Compare1(ushort[] x, ushort[] y)
                  {
                      int pos = 0;
                      int len = Math.Min(x.Length, y.Length);
                      while (pos < len && x[pos] == y[pos])
                          pos++;
      
                      return pos < len ?
                          x[pos].CompareTo(y[pos]) :
                          x.Length.CompareTo(y.Length);
                  }
      
                  public unsafe static int Compare2(ushort[] x, ushort[] y)
                  {
                      int pos = 0;
                      int len = Math.Min(x.Length, y.Length);
                      fixed (ushort* fpx = &x[0], fpy = &y[0])
                      {
                          ushort* px = fpx;
                          ushort* py = fpy;
                          while (pos < len && *px == *py)
                          {
                              px++;
                              py++;
                              pos++;
                          }
                      }
      
                      return pos < len ?
                          x[pos].CompareTo(y[pos]) :
                          x.Length.CompareTo(y.Length);
                  }
      
                  public static int Compare3(ushort[] x, ushort[] y)
                  {
                      int pos = 0;
                      int len = Math.Min(x.Length, y.Length);
      
                      // the below is probably not worth it for less than 5 (or so) elements,
                      //   so just do the old way
                      if (len < 5)
                      {
                          while (pos < len && x[pos] == y[pos])
                              ++pos;
      
                          return pos < len ?
                            x[pos].CompareTo(y[pos]) :
                            x.Length.CompareTo(y.Length);
                      }
      
                      ushort lastX = x[len - 1];
                      bool lastSame = true;
                      if (x[len - 1] == y[len - 1])
                          --x[len - 1]; // can be anything else
                      else
                          lastSame = false;
      
                      while (x[pos] == y[pos])
                          ++pos;

                      x[len - 1] = lastX; // restore the input array before returning
      
                      return pos < len - 1 ?
                          x[pos].CompareTo(y[pos]) :
                          lastSame ? x.Length.CompareTo(y.Length)
                                   : lastX.CompareTo(y[len - 1]);
                  }
              }
          }
      

Answer 3: (score: 1)

Just some ideas (they may be wrong and need to be tested):

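First. A larger item type (e.g. int on x32 or long on x64; let's call this type TLong) could give better performance.
If you pack several ushort items into one TLong item (in big-endian order!), you can compare several items at once.
But you will need to take care of the last item of the new array of type TLong if it is not completely filled. There could be some 'tricky cases'. I don't see any right now, but I'm not sure.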
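
A minimal sketch of this first idea, assuming `TLong` is `ulong` so that four `ushort` values fit into one item. The names `PackedCompare`, `Pack` and `CompareArrays` are hypothetical. It also shows one of the 'tricky cases': a partially filled last item is padded with zeros, so the original lengths are needed to tell {1, 2} apart from {1, 2, 0}.

```csharp
using System;

public static class PackedCompare
{
    // Packs groups of four ushort values into one ulong, most significant first,
    // so that comparing two ulongs compares four original elements in order.
    // A short final group is padded with zeros on the right.
    public static ulong[] Pack(ushort[] a)
    {
        var packed = new ulong[(a.Length + 3) / 4];
        for (int i = 0; i < a.Length; i++)
        {
            int shift = 48 - 16 * (i % 4);
            packed[i / 4] |= (ulong)a[i] << shift;
        }
        return packed;
    }

    // Compares two packed keys. The original lengths break ties, because
    // zero padding alone cannot distinguish {1, 2} from {1, 2, 0}.
    public static int Compare(ulong[] x, int xLen, ulong[] y, int yLen)
    {
        int len = Math.Min(x.Length, y.Length);
        for (int i = 0; i < len; i++)
            if (x[i] != y[i])
                return x[i] < y[i] ? -1 : 1;
        return xLen.CompareTo(yLen);
    }

    // Convenience helper for the demo: packs both arrays and compares them.
    public static int CompareArrays(ushort[] a, ushort[] b)
    {
        return Compare(Pack(a), a.Length, Pack(b), b.Length);
    }

    public static void Main()
    {
        Console.WriteLine(CompareArrays(new ushort[] { 1, 2, 3 }, new ushort[] { 1, 2, 4 }));    // -1
        Console.WriteLine(CompareArrays(new ushort[] { 1, 2, 3, 5 }, new ushort[] { 1, 2, 4 })); // -1
        Console.WriteLine(CompareArrays(new ushort[] { 1, 2, 5, 3 }, new ushort[] { 1, 2, 4 })); // 1
        Console.WriteLine(CompareArrays(new ushort[] { 1, 2, 3, 4 }, new ushort[] { 1, 2, 3 })); // 1
        Console.WriteLine(CompareArrays(new ushort[] { 1, 2 }, new ushort[] { 1, 2, 0 }));       // -1
    }
}
```

In practice the packing would be done once, when a key is created, so that only `Compare` runs inside the sort or the sorted set.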

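Second. Even more! In some cases we can pack more of the original items into one item of type TLong.
Let's go back to the initial arrays of type ushort. Let K be the greatest number that occurs in any of your arrays (i.e. in all the paths you want to sort), so that t <= K holds for every number t stored in any ushort. Then imagine that every t is just a 'digit' in a base-K numeral system (strictly speaking the base has to be K + 1, so that K itself is a valid digit).
That means every path in your graph (i.e. every ushort array) simply determines a number in this numeral system. So instead of manipulating ushort arrays you need to do something like this: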

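  1. Determine the greatest power of the base that fits into the type TLong. Let's call it p: int p = 0; while (Math.Pow(K + 1, p + 1) <= TLong.MaxValue) p++;

  2. Take the i-th group of p items of the ushort array, compute the corresponding number in the base-K numeral system, and save it as the i-th item of an array of type TLong: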

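    List<TLong> num = new List<TLong>();
    int i = 0;
    while (p * i < a.Length)
    {
        TLong y = 0;
    
        //combine a chunk of p elements into one number in base K + 1,
        //so that every element value 0..K is a valid digit
        for (int j = p * i; j < Math.Min(p * (i + 1), a.Length); j++)
            y = y * (K + 1) + a[j];

        //pad a short final chunk with zero digits so every item has p digits
        for (int j = a.Length; j < p * (i + 1); j++)
            y = y * (K + 1);
    
        num.Add(y);
        i++;
    }
    TLong[] result = num.ToArray();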
    
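  3. This is a dynamic, precomputed transformation, so K may differ between HTML documents, and it will be faster than the first idea if K is much smaller than 255. The precomputation also has linear complexity, so it will not affect your performance much.

  4. In general, you convert your arrays into big numbers in the base-K numeral system, stored in arrays. That's it!
  5. There is no need to change the initial sorting algorithm or the improvements from the other comments (check it anyway, I could be wrong!)

I hope this helps.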
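
The second idea could be sketched as follows, again assuming `TLong` is `ulong`. The names `BaseKPacking`, `DigitsPerWord`, `Pack` and `CompareArrays` are hypothetical. The base is taken as K + 1 so that K itself is a valid digit, and the number of digits per item is computed with integer arithmetic to avoid floating-point rounding. As with plain packing, the original lengths are used to break ties caused by zero padding.

```csharp
using System;
using System.Collections.Generic;

public static class BaseKPacking
{
    // Largest p such that numBase^p <= ulong.MaxValue, so any p digits fit in a ulong.
    // Requires numBase >= 2.
    public static int DigitsPerWord(ulong numBase)
    {
        int p = 0;
        ulong limit = ulong.MaxValue;
        while (limit >= numBase)
        {
            limit /= numBase;
            p++;
        }
        return p;
    }

    // Converts each chunk of p elements into one number in the given base.
    public static ulong[] Pack(ushort[] a, ulong numBase, int p)
    {
        var words = new List<ulong>();
        for (int start = 0; start < a.Length; start += p)
        {
            ulong y = 0;
            for (int j = start; j < start + p; j++)
            {
                // Pad a short final chunk with zero digits on the right so that
                // every word has exactly p digits and words stay comparable.
                ulong digit = j < a.Length ? a[j] : 0UL;
                y = y * numBase + digit;
            }
            words.Add(y);
        }
        return words.ToArray();
    }

    // Word-by-word comparison; the original lengths break ties caused by padding.
    public static int Compare(ulong[] x, int xLen, ulong[] y, int yLen)
    {
        int len = Math.Min(x.Length, y.Length);
        for (int i = 0; i < len; i++)
            if (x[i] != y[i])
                return x[i] < y[i] ? -1 : 1;
        return xLen.CompareTo(yLen);
    }

    public static int CompareArrays(ushort[] a, ushort[] b, ushort k)
    {
        ulong numBase = (ulong)k + 1;
        int p = DigitsPerWord(numBase);
        return Compare(Pack(a, numBase, p), a.Length, Pack(b, numBase, p), b.Length);
    }

    public static void Main()
    {
        const ushort K = 9; // assumed largest element value, giving base 10
        Console.WriteLine(DigitsPerWord(10));                                                       // 19
        Console.WriteLine(CompareArrays(new ushort[] { 1, 2, 3 }, new ushort[] { 1, 2, 4 }, K));    // -1
        Console.WriteLine(CompareArrays(new ushort[] { 1, 2, 5, 3 }, new ushort[] { 1, 2, 4 }, K)); // 1
        Console.WriteLine(CompareArrays(new ushort[] { 1, 2, 3, 4 }, new ushort[] { 1, 2, 3 }, K)); // 1
    }
}
```

With K = 9 a single `ulong` holds 19 elements, compared with 4 for plain 16-bit packing, which is where the potential gain for small K comes from.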