使用java中实现的中位数为quickselect选择pivot?

时间:2013-12-23 21:06:12

标签: java algorithm median median-of-medians quickselect

我在github中找到了quickselect算法的代码,也称为order-statistics。这段代码工作正常。

我不理解medianOf3方法,它应该按排序顺序排列第一个,中间和最后一个索引。但实际上在调用medianof3方法后输出数组时并不是这样。 除了最后一次调用swap(list, centerIndex, rightIndex - 1);之外,我可以按照这种方法进行操作。有谁可以解释为什么这个被称为?

import java.util.Arrays;



/**
* This program determines the kth order statistic (the kth smallest number in a
* list) in O(n) time in the average case and O(n^2) time in the worst case. It
* achieves this through the Quickselect algorithm.
*
* @author John Kurlak <john@kurlak.com>
* @date 1/17/2013
*/
public class Quickselect {
   /**
* Runs the program with an example list.
*
* @param args The command-line arguments.
*/
   public static void main(String[] args) {
       int[] list = { 3, 5, 9, 10, 7, 40, 23, 45, 21, 2 };
       int k = 6;
       int median = medianOf3(list, 0, list.length-1);
       System.out.println(median);
       System.out.println("list is "+ Arrays.toString(list));
       Integer kthSmallest = quickselect(list, k);

       if (kthSmallest != null) {
           System.out.println("The kth smallest element in the list where k=" + k + " is " + kthSmallest + ".");
       } else {
           System.out.println("There is no kth smallest element in the list where k=" + k + ".");
       }
       System.out.println(Arrays.toString(list));
   }

   /**
* Determines the kth order statistic for the given list.
*
* @param list The list.
* @param k The k value to use.
* @return The kth order statistic for the list.
*/
   public static Integer quickselect(int[] list, int k) {
       return quickselect(list, 0, list.length - 1, k);
   }

   /**
* Recursively determines the kth order statistic for the given list.
*
* @param list The list.
* @param leftIndex The left index of the current sublist.
* @param rightIndex The right index of the current sublist.
* @param k The k value to use.
* @return The kth order statistic for the list.
*/
   public static Integer quickselect(int[] list, int leftIndex, int rightIndex, int k) {
       // Edge case
       if (k < 1 || k > list.length) {
           return null;
       }

       // Base case
       if (leftIndex == rightIndex) {
           return list[leftIndex];
       }

       // Partition the sublist into two halves
       int pivotIndex = randomPartition(list, leftIndex, rightIndex);
       int sizeLeft = pivotIndex - leftIndex + 1;

       // Perform comparisons and recurse in binary search / quicksort fashion
       if (sizeLeft == k) {
           return list[pivotIndex];
       } else if (sizeLeft > k) {
           return quickselect(list, leftIndex, pivotIndex - 1, k);
       } else {
           return quickselect(list, pivotIndex + 1, rightIndex, k - sizeLeft);
       }
   }

   /**
* Randomly partitions a set about a pivot such that the values to the left
* of the pivot are less than or equal to the pivot and the values to the
* right of the pivot are greater than the pivot.
*
* @param list The list.
* @param leftIndex The left index of the current sublist.
* @param rightIndex The right index of the current sublist.
* @return The index of the pivot.
*/
   public static int randomPartition(int[] list, int leftIndex, int rightIndex) {
       int pivotIndex = medianOf3(list, leftIndex, rightIndex);
       int pivotValue = list[pivotIndex];
       int storeIndex = leftIndex;

       swap(list, pivotIndex, rightIndex);

       for (int i = leftIndex; i < rightIndex; i++) {
           if (list[i] <= pivotValue) {
               swap(list, storeIndex, i);
               storeIndex++;
           }
       }

       swap(list, rightIndex, storeIndex);

       return storeIndex;
   }

   /**
* Computes the median of the first value, middle value, and last value
* of a list. Also rearranges the first, middle, and last values of the
* list to be in sorted order.
*
* @param list The list.
* @param leftIndex The left index of the current sublist.
* @param rightIndex The right index of the current sublist.
* @return The index of the median value.
*/
   public static int medianOf3(int[] list, int leftIndex, int rightIndex) {
       int centerIndex = (leftIndex + rightIndex) / 2;

       if (list[leftIndex] > list[rightIndex]) {
           swap(list, leftIndex, centerIndex);
       }

       if (list[leftIndex] > list[rightIndex]) {
           swap(list, leftIndex, rightIndex);
       }

       if (list[centerIndex] > list[rightIndex]) {
           swap(list, centerIndex, rightIndex);
       }

       swap(list, centerIndex, rightIndex - 1);

       return rightIndex - 1;
   }

   /**
* Swaps two elements in a list.
*
* @param list The list.
* @param index1 The index of the first element to swap.
* @param index2 The index of the second element to swap.
*/
   public static void swap(int[] list, int index1, int index2) {
       int temp = list[index1];
       list[index1] = list[index2];
       list[index2] = temp;
   }
}

2 个答案:

答案 0 :(得分:0)

函数medianOf3用于定义左中和右的顺序。最后的陈述

swap(list, centerIndex, rightIndex - 1)

用于实现以下排序前提条件:

  

然而,   而不是递归到双方,如快速排序,快速选择   只有一个方面才能进入 - 一方是元素   寻找。这降低了平均复杂度O(n log n)(in   快速排序)到O(n)(快速选择)。

然后算法继续:

   for (int i = leftIndex; i < rightIndex; i++) {
       if (list[i] <= pivotValue) {
           swap(list, storeIndex, i);
           storeIndex++;
       }
   }

以便

  

枢轴左侧的值小于或等于   枢轴和枢轴右侧的值大于   支点。

答案 1 :(得分:0)

所以我编写了原始代码,但是我做得很差,使其可读。

回顾它,我不认为这行代码是必要的,但我认为这是一个小优化。如果我们删除代码行并返回centerIndex,它似乎没有任何问题。

不幸的是,它执行的优化应该从medianOf3()重构,然后转移到randomPartition()

基本上,优化是我们希望在分区之前尽可能“部分排序”我们的子阵列。原因是:我们的数据排序越多,我们未来的分区选择就越好,这意味着我们的运行时间有望比O(n ^ 2)更接近O(n)。在randomPartition()方法中,我们将数据透视值移动到我们正在查看的子数组的最右侧。这会将最右边的值移动到子阵列的中间。这是不希望的,因为最右边的值应该是“更大的值”。我的代码试图通过将数据透视索引放在最右边的索引旁边来防止这种情况。然后,当枢轴索引与randomPartition()中最右边的索引交换时,“较大”最右边的值不会移动到子阵列的中间,而是保持在右边。