解析mergesort例程

时间:2012-10-04 18:43:26

标签: algorithm delphi sorting computer-science mergesort

免责声明:这是一个需要对代码和算法进行一些解释的问题。它不是为了解决任何事情或优化任何事情,而是为了促进理解。

我对排序例程的理解并不是很好。我请求帮助将已经可用的mergesort代码从integer type转换为string typedelphi mergesort for string arrays。收到答案后,我开始了解分拣程序。

有一些资源可以帮助理解:

  1. http://www.iti.fh-flensburg.de/lang/algorithmen/sortieren/merge/mergen.htm
  2. http://www.youtube.com/watch?v=9Qk1t66g7IU
  3. 我试图剖析代码以跟随它。这个问题不是我尝试验证我自己对mergesort的理解,而是以清晰的方式显示排序例程。这个问题的价值在于那些试图更好地理解mergesort的人。这是必不可少的,因为如果你能很好地理解一个原型,就可以更容易地理解其他种类。

    我的问题是为什么我们添加“1”来设置长度和“结果”

    SetLength(AVals, Length(Vals) div 2 + 1);
    Result := 1 + PerformMergeSort(0, High(Vals));
    

    为什么我们在这里减去“1”? 编辑:我认为如果不减去1,K将超出范围?

    Result := k - 1;
    

    这是这个问题的代码; BTW这是一个优化的mergesort,因为它只复制了一半的数组:

    function MergeSortRemoveDuplicates(var Vals: array of Integer):Integer;
    var
      AVals: array of Integer;
    
       //returns index of the last valid element
      function Merge(I0, I1, J0, J1: Integer):Integer;
      var
        i, j, k, LC:Integer;
      begin
        LC := I1 - I0;
        for i := 0 to LC do
          AVals[i]:=Vals[i + I0];
          //copy lower half or Vals into temporary array AVals
    
        k := I0;
        i := 0;
        j := J0;
        while ((i <= LC) and (j <= J1)) do
        if (AVals[i] < Vals[j]) then begin
          Vals[k] := AVals[i];
          inc(i);
          inc(k);
        end else  if (AVals[i] > Vals[j]) then begin
          Vals[k]:=Vals[j];
          inc(k);
          inc(j);
        end else begin //duplicate
          Vals[k] := AVals[i];
          inc(i);
          inc(j);
          inc(k);
        end;
    
        //copy the rest
        while i <= LC do begin
          Vals[k] := AVals[i];
          inc(i);
          inc(k);
        end;
    
        if k <> j then
          while j <= J1 do begin
            Vals[k]:=Vals[j];
            inc(k);
            inc(j);
          end;
    
        Result := k - 1;
      end;
    
     //returns index of the last valid element
    
      function PerformMergeSort(ALo, AHi:Integer): Integer; //returns
      var
        AMid, I1, J1:Integer;
      begin
    
      //It would be wise to use Insertion Sort when (AHi - ALo) is small (about 32-100)
        if (ALo < AHi) then
        begin
          AMid:=(ALo + AHi) shr 1;
          I1 := PerformMergeSort(ALo, AMid);
          J1 := PerformMergeSort(AMid + 1, AHi);
          Result := Merge(ALo, I1, AMid + 1, J1);
        end else
          Result := ALo;
      end;
    
    begin
      SetLength(AVals, Length(Vals) div 2 + 1);
      Result := 1 + PerformMergeSort(0, High(Vals));
    end;
    

    这是我对非常小修改的理解:

    function MergeSortRemoveDuplicates(var Vals: array of Integer):Integer;
    var
      AVals: array of Integer;
    
       //returns index of the last valid element
      function Merge(I0, I1, J0, J1: Integer):Integer;
      var
        i, j, k, LC:Integer;
      begin
        // difference between mid-point on leftside
        // between low(Original_array) and midpoint(true Original_array midpoint)
        // subtracting I0 which is Low(Original_array)
        // or here equals zero(0)
        // so LC is quarter point in Original_array??
        LC := I1 - I0;
    
        // here we walk from begining of array
        // and copy the elements between zero and LC
        // this is funny call that Vals[i + I0] like 0 + 0
        // then 1 + 0 and so on. I guess this guarantees if we are
        // starting from non-zero based array??
        for i := 0 to LC do
          AVals[i]:=Vals[i + I0];
    
    
        // k equal low(Original_array)
        k := I0;
    
        // I will be our zero based counter element
        i := 0;
    
        // J will be (midpoint + 1) or
        // begining element of right side of array
        j := J0;
    
        // while we look at Copy_array elements
        // between first element (low(Copy_array)
        // and original_array from midpoint + 1 to high(Original_array)
        // we start to sort it
        while ((i <= LC) and (j <= J1)) do
    
        // if the value at Copy_array is smaller than the Original_array
        // we move it to begining of Original_array
        // remember position K is first element
        if (AVals[i] < Vals[j]) then begin
          Vals[k] := AVals[i];
    
          // move to next element in Copy_array
          inc(i);
    
          // move to next element in Original_array
          inc(k);
    
        // if the value at copy_array is larger
        // then we move smaller value from J Original_array (J is midpoint+1)
        // to position K original_array (K now is the lower part of ) Original_array)
        end else  if (AVals[i] > Vals[j]) then begin
          Vals[k]:=Vals[j];
    
          //move K to the next element in Original_array
          inc(k);
    
          // move j to next element in Original_array
          inc(j);
    
        // if the value in Original_array is equal to the element in Copy_array
        // do nothing and count everything up
        // so we end up with one copy from duplicate and disregard the rest
        end else begin //duplicate
          Vals[k] := AVals[i];
          inc(i);
          inc(j);
          inc(k);
        end;
    
        //copy the rest
        while i <= LC do begin
          Vals[k] := AVals[i];
          inc(i);
          inc(k);
        end;
    
        // if the counters do not endup at the same element
        // this means we have some that maybe leftover on
        // the right side of the Original_array.
        // This explains why K does not equal J : there are still elements left over
        // then copy them to Original_array
        // starting at position K.
        if k <> j then
          while j <= J1 do begin
            Vals[k]:=Vals[j];
            inc(k);
            inc(j);
          end;
    
        // why K - 1?
        // function needs result so return will be null if called
        // I don't understand this part
        Result := k - 1;
      end;
    
     //returns index of the last valid element
    
      function PerformMergeSort(ALo, AHi:Integer): Integer; //returns
      var
        AMid, I1, J1:Integer;
      begin
    
      //It would be wise to use Insertion Sort when (AHi - ALo) is small (about 32-100)
        if (ALo < AHi) then
        begin
          AMid:=(ALo + AHi) shr 1;    // midpoint
          I1 := PerformMergeSort(ALo, AMid);  //recursive call I1 is a data point on the left
          J1 := PerformMergeSort(AMid + 1, AHi);  // recursive call I1 is a data point on the right
          Result := Merge(ALo, I1, AMid + 1, J1);
        end else
          Result := ALo;
      end;
    
    begin
      // test if array is even then we can split nicely down middle
      if Length(Vals) mod 2 = 0 then
      begin
        SetLength(AVals, Length(Vals) shr 1);
        Result := PerformMergeSort(0, High(Vals));
      end
      else
      //array is odd let us add 1 to it and make it even
      // shr 1 is essentially dividing by 2 but doing it on the bit level
      begin
        SetLength(AVals, (Length(Vals) + 1) shr 1);
        Result := PerformMergeSort(0, High(Vals));
      end;
    end;
    

1 个答案:

答案 0 :(得分:2)

这是我对作者提供的代码的修改,旨在删除排序过程中的重复项。一些解释:

外部功能:

  1. 我们应该提供缓冲区(AVals)来存储初始aray的一半。 长度(Vals)div 2 + 1 为奇数和偶数大小的数组提供了足够的空间,没有不必要的复杂化。更好的价值(适用于所有情况):长度(Vals + 1)div 2
  2. 内部程序PerformMergeSort返回最后一个有效元素的索引,但外部程序返回有效元素的 count (在引用的主题中有注释),所以我用(1 + PerformMergeSort())。

    原因:在内部我们必须使用索引,但此过程的最终用户应该知道新的数组长度。

  3. 内部功能PerformMergeSort:

    它接受数组块的开始和结束索引,对此块进行排序并返回最后一个有效元素的索引。在递归调用之后,我们遇到了这种情况。 不变量:两个块都是排序的,它们不包含重复项,左段的非零长度

    *****ACDEFG****BCDEGHILM******
         ^    ^    ^       ^
         |    |    |       |
         Alo  I1   AMid+1  J1 
         I0   I1   J0      J1  //as named in Merge 
         \____/
           LC+1 elements 
    

    合并后:

    *****ABCDEFGHILM**************
         ^         ^^
         |         ||__k
         |         | 
         Alo       Result       
    

    内部功能合并:

    使用提供的示例,笔和纸,逐步合并,看看它是如何工作的。

    关于复制周期:我们将(LC + 1)元素复制到临时缓冲区AVAL,使用AVals的起始段(始终从0开始)和主阵列的正确段(从I0开始,通常为非零) )