Explanation of merge sort for dummies

Time: 2012-05-08 16:26:16

Tags: python algorithm sorting mergesort

I found this code online:

def merge(left, right):
    result = []
    i, j = 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result += left[i:]
    result += right[j:]
    return result

def mergesort(list):
    if len(list) < 2:
        return list
    middle = len(list) // 2  # integer division: a slice index must be an int
    left = mergesort(list[:middle])
    right = mergesort(list[middle:])
    return merge(left, right)

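When I run it, it works 100%. I just don't really understand how merge sort works, or how the recursive function manages to order both the left and the right halves correctly.

8 answers:

Answer 0: (score: 57)

I believe the key to understanding merge sort is understanding the following principle, which I'll call the merge principle:

Given two separate lists A and B ordered from least to greatest, construct a list C by repeatedly comparing the least value of A to the least value of B, removing the lesser value, and appending it onto C. When one list is exhausted, append the remaining items in the other list onto C in order. The list C is then also a sorted list.

If you work this out by hand a few times, you'll see that it's correct. For example: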

A = 1, 3
B = 2, 4
C = 
min(min(A), min(B)) = 1

A = 3
B = 2, 4
C = 1
min(min(A), min(B)) = 2

A = 3
B = 4
C = 1, 2
min(min(A), min(B)) = 3

A = 
B = 4
C = 1, 2, 3

Now A is exhausted, so extend C with the remaining values from B:

C = 1, 2, 3, 4

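The merge principle is also easy to prove. The minimum value of A is no greater than any other value of A, and the minimum value of B is no greater than any other value of B. If the minimum value of A is less than or equal to the minimum value of B, then it is also no greater than any value of B. It is therefore the smallest of all the values in A and B combined.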

So as long as you keep appending the value that meets those criteria to C, you get a sorted list. This is what the merge function above does.
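The principle can also be followed literally, the way the hand-worked example does it. Here is a minimal sketch, assuming a hypothetical helper named `merge_by_principle` (it is not part of the code in the question) that really removes the smaller front item from A or B until one of them is exhausted:

```python
from collections import deque

def merge_by_principle(a, b):
    """Merge two already-sorted lists by repeatedly removing the smaller front item."""
    a, b = deque(a), deque(b)  # work on copies so the callers' lists are untouched
    c = []
    while a and b:
        # The front of a sorted list is its minimum value.
        if a[0] <= b[0]:
            c.append(a.popleft())
        else:
            c.append(b.popleft())
    # One list is exhausted; whatever remains in the other is already in order.
    c.extend(a)
    c.extend(b)
    return c

print(merge_by_principle([1, 3], [2, 4]))  # [1, 2, 3, 4]
```

The merge function in the question does the same thing, but advances the indexes i and j instead of removing items.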

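Now, given this principle, it's easy to understand a sorting technique that sorts a list by dividing it into smaller lists, sorting those lists, and then merging the sorted lists together. The mergesort function is simply a function that divides a list in half, sorts the two halves, and then merges them in the manner described above.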

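The only catch is that because it is recursive, when it sorts the two sub-lists, it does so by passing them to itself! If you have difficulty understanding the recursion here, I would suggest studying simpler problems first. But if you already grasp the basics of recursion, then all you have to realize is that a one-item list is already sorted. Merging two one-item lists generates a sorted two-item list; merging two two-item lists generates a sorted four-item list; and so on.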
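That last observation can be watched directly. The sketch below is a bottom-up variant, not the recursive function from the question: the name `bottom_up_mergesort` is made up for illustration, and `merge` is copied from the question. It starts from one-item lists and merges neighbours pass by pass:

```python
def merge(left, right):
    # Same merge as in the question.
    result = []
    i, j = 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result += left[i:]
    result += right[j:]
    return result

def bottom_up_mergesort(items):
    runs = [[x] for x in items]  # every one-item list is already sorted
    while len(runs) > 1:
        merged = []
        for k in range(0, len(runs), 2):
            if k + 1 < len(runs):
                merged.append(merge(runs[k], runs[k + 1]))
            else:
                merged.append(runs[k])  # odd run out: carry it to the next pass
        runs = merged
        print(runs)
    return runs[0] if runs else []

bottom_up_mergesort([6, 5, 4, 3, 2, 1, 8, 7])
# [[5, 6], [3, 4], [1, 2], [7, 8]]
# [[3, 4, 5, 6], [1, 2, 7, 8]]
# [[1, 2, 3, 4, 5, 6, 7, 8]]
```

The recursive version performs the same kind of merges; the recursion is just a way of reaching the one-item lists first.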

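Answer 1: (score: 15)

When I run into difficulty understanding how an algorithm works, I add debug output to check what really happens inside it.

Here is the code with debug output. Try to understand all the steps by following the recursive calls of mergesort and what merge does with the output: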

def merge(left, right):
    result = []
    i, j = 0, 0
    while i < len(left) and j < len(right):
        print('left[i]: {} right[j]: {}'.format(left[i],right[j]))
        if left[i] <= right[j]:
            print('Appending {} to the result'.format(left[i]))           
            result.append(left[i])
            print('result now is {}'.format(result))
            i += 1
            print('i now is {}'.format(i))
        else:
            print('Appending {} to the result'.format(right[j]))
            result.append(right[j])
            print('result now is {}'.format(result))
            j += 1
            print('j now is {}'.format(j))
    print('One of the list is exhausted. Adding the rest of one of the lists.')
    result += left[i:]
    result += right[j:]
    print('result now is {}'.format(result))
    return result

def mergesort(L):
    print('---')
    print('mergesort on {}'.format(L))
    if len(L) < 2:
        print('length is 1: returning the list withouth changing')
        return L
    middle = len(L) // 2  # integer division: a slice index must be an int
    print('calling mergesort on {}'.format(L[:middle]))
    left = mergesort(L[:middle])
    print('calling mergesort on {}'.format(L[middle:]))
    right = mergesort(L[middle:])
    print('Merging left: {} and right: {}'.format(left,right))
    out = merge(left, right)
    print('exiting mergesort on {}'.format(L))
    print('#---')
    return out


mergesort([6,5,4,3,2,1])

Output:

---
mergesort on [6, 5, 4, 3, 2, 1]
calling mergesort on [6, 5, 4]
---
mergesort on [6, 5, 4]
calling mergesort on [6]
---
mergesort on [6]
length is 1: returning the list withouth changing
calling mergesort on [5, 4]
---
mergesort on [5, 4]
calling mergesort on [5]
---
mergesort on [5]
length is 1: returning the list withouth changing
calling mergesort on [4]
---
mergesort on [4]
length is 1: returning the list withouth changing
Merging left: [5] and right: [4]
left[i]: 5 right[j]: 4
Appending 4 to the result
result now is [4]
j now is 1
One of the list is exhausted. Adding the rest of one of the lists.
result now is [4, 5]
exiting mergesort on [5, 4]
#---
Merging left: [6] and right: [4, 5]
left[i]: 6 right[j]: 4
Appending 4 to the result
result now is [4]
j now is 1
left[i]: 6 right[j]: 5
Appending 5 to the result
result now is [4, 5]
j now is 2
One of the list is exhausted. Adding the rest of one of the lists.
result now is [4, 5, 6]
exiting mergesort on [6, 5, 4]
#---
calling mergesort on [3, 2, 1]
---
mergesort on [3, 2, 1]
calling mergesort on [3]
---
mergesort on [3]
length is 1: returning the list withouth changing
calling mergesort on [2, 1]
---
mergesort on [2, 1]
calling mergesort on [2]
---
mergesort on [2]
length is 1: returning the list withouth changing
calling mergesort on [1]
---
mergesort on [1]
length is 1: returning the list withouth changing
Merging left: [2] and right: [1]
left[i]: 2 right[j]: 1
Appending 1 to the result
result now is [1]
j now is 1
One of the list is exhausted. Adding the rest of one of the lists.
result now is [1, 2]
exiting mergesort on [2, 1]
#---
Merging left: [3] and right: [1, 2]
left[i]: 3 right[j]: 1
Appending 1 to the result
result now is [1]
j now is 1
left[i]: 3 right[j]: 2
Appending 2 to the result
result now is [1, 2]
j now is 2
One of the list is exhausted. Adding the rest of one of the lists.
result now is [1, 2, 3]
exiting mergesort on [3, 2, 1]
#---
Merging left: [4, 5, 6] and right: [1, 2, 3]
left[i]: 4 right[j]: 1
Appending 1 to the result
result now is [1]
j now is 1
left[i]: 4 right[j]: 2
Appending 2 to the result
result now is [1, 2]
j now is 2
left[i]: 4 right[j]: 3
Appending 3 to the result
result now is [1, 2, 3]
j now is 3
One of the list is exhausted. Adding the rest of one of the lists.
result now is [1, 2, 3, 4, 5, 6]
exiting mergesort on [6, 5, 4, 3, 2, 1]
#---

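Answer 2: (score: 4)

There are a couple of ways to help yourself understand this:

Step through the code in a debugger and watch what happens. Or, step through it on paper (with a very small example) and watch what happens.

(Personally, I find doing this kind of thing on paper more instructive.)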

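Conceptually, it works like this: the input list keeps getting chopped into smaller and smaller pieces by being halved (e.g. list[:middle] is the first half). Each half is halved again and again until its length is less than 2, i.e. until it is empty or a single element. These individual pieces are then put back together by the merge routine, which appends or interleaves the two sub-lists into the result list, and hence you get a sorted list. Because the two sub-lists are already sorted, the appending/interleaving is a fast (O(n)) operation.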
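To make the O(n) claim concrete, here is a small sketch that counts comparisons. `merge_counting` is a hypothetical variant of the question's merge, written only for this illustration; merging two sorted lists never needs more than len(left) + len(right) - 1 comparisons:

```python
def merge_counting(left, right):
    """The question's merge, extended to also return the number of comparisons made."""
    result = []
    comparisons = 0
    i, j = 0, 0
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result += left[i:]
    result += right[j:]
    return result, comparisons

# Fully interleaved input: the worst case, 4 + 4 - 1 comparisons.
print(merge_counting([1, 3, 5, 7], [2, 4, 6, 8]))  # ([1, 2, 3, 4, 5, 6, 7, 8], 7)
# Everything on the left is smaller: the loop stops as soon as left is exhausted.
print(merge_counting([1, 2, 3, 4], [5, 6, 7, 8]))  # ([1, 2, 3, 4, 5, 6, 7, 8], 4)
```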

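The key to this (in my view) is not the merge routine, which is pretty obvious once you understand that its inputs will always be sorted. The "trick" (I use quotes because it's not a trick, it's computer science :-)) is that, to guarantee the inputs to merge are sorted, you have to keep recursing until you reach a list that must be sorted. That is why you keep calling mergesort recursively until the list is fewer than 2 elements long.

Recursion, and by extension merge sort, can be non-obvious when you first come across them. You might want to consult a good algorithms book (e.g. DPV is available online, legally and for free), but you can get a long way by stepping through the code you have. If you really want to get into it, the Stanford/Coursera algo course will be running again soon, and it covers merge sort in fine detail.

If you really want to understand it, read chapter 2 of that book reference, then throw away the code above and rewrite it from scratch. Seriously.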

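Answer 3: (score: 4)

Merge sort has always been one of my favorite algorithms.

You start with short sorted sequences and keep merging them, in order, into larger sorted sequences. It's that simple.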

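The recursive part means you are working backwards: you start with the whole sequence and sort its two halves. Each half is split in turn, until the sort becomes trivial because the sequence has only zero or one elements. As the recursive calls return, the sorted sequences get larger, just as I said in the initial description.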
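One way to see the "working backwards" shape is to indent each call by its recursion depth. This is only a sketch: `mergesort_traced` and its `depth` parameter are names invented here, and the merge step uses the standard library's `heapq.merge` in place of the question's merge function to keep the example short:

```python
import heapq

def mergesort_traced(items, depth=0):
    indent = '  ' * depth
    print('{}split  {}'.format(indent, items))
    if len(items) < 2:
        return items  # zero or one element: trivially sorted
    middle = len(items) // 2
    left = mergesort_traced(items[:middle], depth + 1)
    right = mergesort_traced(items[middle:], depth + 1)
    # heapq.merge lazily merges already-sorted inputs; list() collects the result.
    out = list(heapq.merge(left, right))
    print('{}merged {}'.format(indent, out))
    return out

mergesort_traced([4, 3, 2, 1])
```

The "split" lines move to the right as the sequence is broken down, and the "merged" lines move back to the left as the calls return with larger sorted sequences.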

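Answer 4: (score: 1)

A picture is worth a thousand words, and an animation is worth ten thousand.

Check out the following animation, taken from Wikipedia, which will help you visualize how the merge sort algorithm actually works.

(Animation: Merge Sort)

A detailed animation with explanation of each step in the sorting process, for the curious.

Another interesting animation of various types of sorting algorithms.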

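Answer 5: (score: 0)

Basically, you take your list, split it, and then sort it. But you apply this method recursively, so you end up splitting it again and again until you have a trivial set that you can sort easily. Then you merge all the simple solutions to get a fully sorted array.

Answer 6: (score: 0)

You can find a good visualization of how merge sort works here: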

http://www.ee.ryerson.ca/~courses/coe428/sorting/mergesort.html

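I hope it helps.

Answer 7: (score: 0)

As the Wikipedia article explains, there are many valid ways to implement a merge sort. How the merge is done also depends on the collection being merged, since certain collections make certain tools available.

I'm not going to answer this question in Python, simply because I can't write it; however, the workings of the "merge sort" algorithm itself seem to be at the heart of the question. A resource that helped me is K.I.T.E's rather outdated webpage on the algorithm (written by a professor), simply because the author of the content eliminates context-meaningful identifiers.

My answer is derived from this resource.

Remember, merge sort algorithms work by taking apart the supplied collection and then putting the individual pieces back together, comparing the pieces to each other as the collection is rebuilt.

Here is the "code" (look to the end for a Java "fiddle"):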

public class MergeSort {

/**
 * @param a     the array to divide
 * @param low   the low INDEX of the array
 * @param high  the high INDEX of the array
 */
public void divide (int[] a, int low, int high, String hilo) {


    /* The if statement, here, determines whether the array has at least two elements (more than one element). The
     * "low" and "high" variables are derived from the bounds of the array "a". So, at the first call, this if 
     * statement will evaluate to true; however, as we continue to divide the array and derive our bounds from the 
     * continually divided array, our bounds will become smaller until we can no longer divide our array (the array 
     * has one element). At this point, the "low" (beginning) and "high" (end) will be the same. And further calls 
     * to the method will immediately return. 
     * 
     * Upon return of control, the call stack is traversed, upward, and the subsequent calls to merge are made as each 
     * merge-eligible call to divide() resolves
     */
    if (low < high) {
        String source = hilo;
        // We now know that we can further divide our array into two equal parts, so we continue to prepare for the division 
        // of the array. REMEMBER, as we progress in the divide function, we are dealing with indexes (positions)

        /* Though the next statement is simple arithmetic, understanding the logic of the statement is integral. Remember, 
         * at this juncture, we know that the array has more than one element; therefore, we want to find the middle of the 
         * array so that we can continue to "divide and conquer" the remaining elements. When two elements are left (say, at
         * positions [0] and [1]), integer division makes the result "0". The element in the first position [0] will be taken
         * as one array and the element at the remaining position [1] will be taken as another, separate array.
         */
        int middle = (low + high) / 2;

        divide(a, low, middle, "low");
        divide(a, middle + 1, high, "high");


        /* Remember, this is only called by those recursive iterations where the if statement evaluated to true. 
         * The call to merge() is only resolved after program control has been handed back to the calling method. 
         */
        merge(a, low, middle, high, source);
    }
}


public void merge (int a[], int low, int middle, int high, String source) {
// Merge, here, is not driven by tiny, "instantiated" sub-arrays. Rather, merge is driven by the indexes of the 
// values in the starting array, itself. Remember, we are organizing the array, itself, and are (obviously)
// using the values contained within it. These indexes, as you will see, are all we need to complete the sort.  

    /* Using the respective indexes, we figure out how many elements are contained in each half. In this 
     * implementation, we will always have a half as the only way that merge can be called is if two
     * or more elements of the array are in question. We also create two "temporary" arrays for the 
     * storage of the larger array's elements so we can "play" with them and not propagate our 
     * changes until we are done. 
     */
    int first_half_element_no       = middle - low + 1;
    int second_half_element_no      = high - middle;
    int[] first_half                = new int[first_half_element_no];
    int[] second_half               = new int[second_half_element_no];

    // Here, we extract the elements. 
    for (int i = 0; i < first_half_element_no; i++) {  
        first_half[i] = a[low + i]; 
    }

    for (int i = 0; i < second_half_element_no; i++) {  
        second_half[i] = a[middle + i + 1]; // extract the elements from a
    }

    int current_first_half_index = 0;
    int current_second_half_index = 0;
    int k = low;


    while (current_first_half_index < first_half_element_no || current_second_half_index < second_half_element_no) {

        if (current_first_half_index >= first_half_element_no) {
            a[k++] = second_half[current_second_half_index++];
            continue;
        }

        if (current_second_half_index >= second_half_element_no) {
            a[k++] = first_half[current_first_half_index++];
            continue;
        }

        if (first_half[current_first_half_index] < second_half[current_second_half_index]) {
            a[k++] = first_half[current_first_half_index++];
        } else {
            a[k++] = second_half[current_second_half_index++];
        }
    }
}
}

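I also have a version, here, that prints out helpful information and gives a more visual representation of what is going on above. The syntax highlighting is also better, if that helps.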
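For readers following the question in Python, the same index-based idea might look roughly like the sketch below. It is an adaptation, not a translation checked against the author's version: the name `merge_range` is made up here, the unused `hilo`/`source` arguments are dropped, and the comparison uses `<=` (as in the question's merge) rather than the `<` used in the Java code.

```python
def divide(a, low, high):
    """Sort a[low..high] (both indexes inclusive) in place."""
    if low < high:  # at least two elements in the range
        middle = (low + high) // 2
        divide(a, low, middle)
        divide(a, middle + 1, high)
        merge_range(a, low, middle, high)

def merge_range(a, low, middle, high):
    """Merge the sorted ranges a[low..middle] and a[middle+1..high] back into a."""
    # Temporary copies, so writing into a does not clobber unread values.
    first_half = a[low:middle + 1]
    second_half = a[middle + 1:high + 1]
    i = j = 0
    k = low
    while i < len(first_half) and j < len(second_half):
        if first_half[i] <= second_half[j]:
            a[k] = first_half[i]
            i += 1
        else:
            a[k] = second_half[j]
            j += 1
        k += 1
    # Copy whatever is left over from the half that was not exhausted.
    remaining = first_half[i:] + second_half[j:]
    a[k:k + len(remaining)] = remaining

data = [6, 5, 4, 3, 2, 1]
divide(data, 0, len(data) - 1)
print(data)  # [1, 2, 3, 4, 5, 6]
```

Unlike the mergesort in the question, which returns a new list, this version rearranges the list it is given and only allocates the temporary halves during each merge.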