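Hi everyone. I'm new to C++ and have been diving into searching and sorting algorithms, so I decided to try writing my own. Here are my prototypes: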
void quicksort(int data[ ], size_t n);
// Precondition: data is an array with at least n components.
// Postcondition: The elements of data have been rearranged so
// that data[0] <= data[1] <= ... <= data[n-1].
void partition(int data[ ], size_t n, size_t& pivot_index);
// Precondition: n > 1, and data is an array (or subarray)
// with at least n elements.
// Postcondition: The function has selected some "pivot value"
// that occurs in data[0]..data[n-1]. The elements of data
// have then been rearranged, and the pivot index set so that:
// -- data[pivot_index] is equal to the pivot;
// -- Each item before data[pivot_index] is <= the pivot;
// -- Each item after data[pivot_index] is > the pivot.
void setPivot(int data[ ], size_t n);
// Precondition: n > 1 and data is an array or subarray
// Postcondition: data[0] holds the selected pivot value
// The original value of data[0] has been swapped with the selected pivot value
I wrote a main to test it:
int main( )
{
    // Announce the program
    cout << "\nImplementing the QuickSort Algorithm\n";
    // Declare useful values
    const char BLANK = ' ';
    size_t i = 0;
    // Initialize our test data arrays
    const size_t SIZE1 = 10;
    int data1[]= {34, 33, 9, 45, 1, -1, 9, -18, 75, 100 };
    const size_t SIZE2 = 15; // Number of elements in the array to be sorted
    int data2[]= {100, 99, 98, 97, 96, 95, 94, 93, 92, 91, 90, 89, 88, 87, 86 };
    const size_t SIZE3 = 1000;
    int data3[SIZE3];
    // Initialize the third array to random int values
    for (i = 0; i < SIZE3; i++)
        data3[i] = rand();
    // Beginning of quick sort tests
    // Sort the arrays and print the result with two blanks after each number
    quicksort(data1, SIZE1);
    cout << "\nSorted First Array: " << endl;
    for (i = 0; i < SIZE1; i++)
        cout << data1[i] << BLANK << BLANK;
    cout << endl;
    quicksort(data2, SIZE2);
    cout << "\nSorted Second Array: " << endl;
    for (i = 0; i < SIZE2; i++)
        cout << data2[i] << BLANK << BLANK;
    cout << endl;
    // On the large third array, just print the first ten and last ten values
    quicksort(data3, SIZE3);
    cout << "\nSorted Third Array (first ten): " << endl;
    for (i = 0; i < 10; i++)
        cout << data3[i] << BLANK << BLANK;
    cout << endl;
    cout << "Sorted Third Array (last ten): " << endl;
    for (i = SIZE3 - 10; i < SIZE3; i++)
        cout << data3[i] << BLANK << BLANK;
    cout << endl << endl;
    system("Pause");
    return EXIT_SUCCESS;
}
I've started on the function definitions:
void quicksort(int data[ ], size_t n)
// Library facilities used: cstdlib
{
    size_t pivot_index; // Array index for the pivot element
    size_t n1; // Number of elements before the pivot element
    size_t n2; // Number of elements after the pivot element
    if (n > 1)
    {
        // Partition the array, and set the pivot index.
        partition(data, n, pivot_index);
        // Compute the sizes of the subarrays.
        n1 = pivot_index;
        n2 = n - n1 - 1;
        // Recursive calls will now sort the subarrays.
        quicksort(data, n1);
        quicksort((data + pivot_index + 1), n2);
    }
}
void partition(int data[ ], size_t n, size_t& pivot_index)
// Library facilities used: algorithm, cassert, cstdlib
{
    assert(n > 1);
    setPivot(data, n);
}
void setPivot(int data[ ], size_t n)
// Library facilities used: algorithm, cassert, cstdlib
// This function chooses a pivot value as the median of three
// randomly selected values. The selected pivot is swapped with
// data[0] so that the pivot value is in the first position of the array
{
    assert(n > 1);
}
My question is: in terms of speed, what is the best way to complete void partition(int data[], size_t n, size_t& pivot_index) and void setPivot(int data[], size_t n)?
Answer 0: (score: 0)
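This is a tricky question to answer, and I like that you jumped in with both feet. When you build a compiler, you choose instructions based on how many units of time one instruction costs compared with another, and the same thinking applies here. You could take Raymond's suggestion and just use prebuilt code, but you wouldn't learn much from that. The two most expensive operations in a comparison sort are comparisons and swaps within the array, so those are what you want to minimize.
I'll start with the reasoning behind a good partition algorithm. Let's assume the pivot is in data[0], and consider the best way to minimize those expensive swaps. Using the array...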
[34, 33, 9, 45, 1, -1, 9, -18, 75, 100]
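A good partition minimizes the number of times we move the pivot, because we can't reduce the number of times the other elements have to move (if we could, we wouldn't need to sort).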
wall:=1
for(i:=1; i < n; ++i){
    if(data[0]>data[i]){
        swap(data, wall, i)
        ++wall
    }
}
swap(data, wall-1, 0)
pivot_index:=wall-1
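Now to answer why this is one of the most efficient approaches: the loop makes a single pass, so each element is compared with the pivot exactly once (n - 1 comparisons), an element is swapped only when it is smaller than the pivot, and the pivot itself moves only once, at the very end.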
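The wall-based partition above can be written in C++ roughly as follows. This is a minimal sketch, not a tuned implementation: the name partition_sketch is made up for illustration, and it assumes the pivot value is already in data[0] (for example, placed there by setPivot).

```cpp
#include <cassert>
#include <cstddef>
#include <utility>   // std::swap

// Sketch of the wall-based partition described above.
// Assumption: the pivot value is already stored in data[0].
void partition_sketch(int data[], std::size_t n, std::size_t& pivot_index)
{
    assert(n > 1);
    std::size_t wall = 1;  // data[1..wall-1] holds the items smaller than the pivot
    for (std::size_t i = 1; i < n; ++i)
    {
        if (data[i] < data[0])  // exactly one comparison per element
        {
            std::swap(data[wall], data[i]);
            ++wall;
        }
    }
    // Move the pivot once, into the slot just left of the wall.
    std::swap(data[0], data[wall - 1]);
    pivot_index = wall - 1;
}
```

On the example array with 34 as the pivot, the pivot ends up at index 6, with the six smaller values before it and 45, 75, 100 after it. Note that with the strict comparison, duplicates of the pivot land after it; to match the postcondition in the question exactly (items after the pivot are > the pivot), the test could be changed to data[i] <= data[0].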
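Next is a smart way to choose the pivot. The comment says median, but I'm going to change that to mean for statistical reasons, and stick with choosing three random values. The running time of quicksort depends on the choice of pivot: choose badly and it can be O(n^2); choose well and it will be approximately O(n*log(n)). The best pivot splits the array into two parts of equal length, with 50% of the data points above it and 50% below it. Strictly speaking that value is the median, and the mean is close to it when the data is not heavily skewed. Computing the mean of the whole population would be slow and would wipe out any performance we gained, so instead we use the mean of a sample to estimate the mean of the population.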
void setPivot(int data[ ], size_t n)
    // pick three random indices in [0, n)
    index1:= floor(rand_uniform()*n)
    index2:= floor(rand_uniform()*n)
    index3:= floor(rand_uniform()*n)
    mean:=(data[index1]+data[index2]+data[index3])/3
    // compare the sampled values (not the indices) with the mean
    offset1:=abs(data[index1]-mean)
    offset2:=abs(data[index2]-mean)
    offset3:=abs(data[index3]-mean)
    // move the sampled value closest to the mean into data[0];
    // using <= and a final else means ties still pick a pivot
    if(offset1 <= offset2 && offset1 <= offset3)
        swap(data,0,index1)
    else if(offset2 <= offset3)
        swap(data,0,index2)
    else
        swap(data,0,index3)
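A C++ version of the pseudocode above might look like the sketch below. The names closest_to_mean and set_pivot_sketch are hypothetical, and the use of std::mt19937 with std::uniform_int_distribution in place of rand_uniform() is my assumption, not something the question requires. The mean is computed in double so that summing three large int values cannot overflow.

```cpp
#include <cassert>
#include <cmath>     // std::fabs
#include <cstddef>
#include <random>
#include <utility>   // std::swap

// Hypothetical helper: returns whichever of the three indices holds the
// value closest to the mean of the three sampled values.
std::size_t closest_to_mean(const int data[], std::size_t i1,
                            std::size_t i2, std::size_t i3)
{
    const double mean =
        (static_cast<double>(data[i1]) + data[i2] + data[i3]) / 3.0;
    const double d1 = std::fabs(data[i1] - mean);
    const double d2 = std::fabs(data[i2] - mean);
    const double d3 = std::fabs(data[i3] - mean);
    if (d1 <= d2 && d1 <= d3) return i1;  // ties resolve to the earlier index
    if (d2 <= d3) return i2;
    return i3;
}

// Sketch of setPivot: sample three random positions and swap the value
// closest to their mean into data[0].
void set_pivot_sketch(int data[], std::size_t n)
{
    assert(n > 1);
    static std::mt19937 gen(std::random_device{}());
    std::uniform_int_distribution<std::size_t> pick(0, n - 1);
    const std::size_t i1 = pick(gen);
    const std::size_t i2 = pick(gen);
    const std::size_t i3 = pick(gen);
    std::swap(data[0], data[closest_to_mean(data, i1, i2, i3)]);
}
```

One observation that may be useful: when the mean is computed exactly, as it is here, the value closest to the mean of three samples is always their median, so this sketch ends up picking the same pivot value as a median-of-three rule. With truncating integer division the two can differ on some inputs.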
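It should be clear that this carries a fair amount of overhead, but setPivot runs in constant time. That means that for sufficiently large n it is faster to do this, because the chosen pivot will, statistically, split the data into two roughly equal parts more often, and that is the whole point of a good divide-and-conquer algorithm.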
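One way to check this claim is to count comparisons. The sketch below is a hypothetical instrumented quicksort (count_comparisons and its use_sample flag are names invented for this example) that sorts the array and returns how many element comparisons the partition step performed, either with whatever value is in data[0] as the pivot or with the sampled pivot described above.

```cpp
#include <cmath>     // std::fabs
#include <cstddef>
#include <random>
#include <utility>   // std::swap

// Hypothetical instrumented quicksort. Sorts data[0..n-1] and returns the
// number of element-versus-pivot comparisons made while partitioning.
//   use_sample == false: the pivot is whatever value sits in data[0].
//   use_sample == true : the pivot is the one of three randomly sampled
//                        values that is closest to the sample mean.
std::size_t count_comparisons(int data[], std::size_t n,
                              bool use_sample, std::mt19937& gen)
{
    if (n <= 1) return 0;
    if (use_sample)
    {
        std::uniform_int_distribution<std::size_t> pick(0, n - 1);
        const std::size_t idx[3] = {pick(gen), pick(gen), pick(gen)};
        const double mean = (static_cast<double>(data[idx[0]])
                             + data[idx[1]] + data[idx[2]]) / 3.0;
        std::size_t best = idx[0];
        for (std::size_t k : idx)
            if (std::fabs(data[k] - mean) < std::fabs(data[best] - mean))
                best = k;
        std::swap(data[0], data[best]);
    }
    std::size_t wall = 1;
    std::size_t comparisons = 0;
    for (std::size_t i = 1; i < n; ++i)
    {
        ++comparisons;
        if (data[i] < data[0])
        {
            std::swap(data[wall], data[i]);
            ++wall;
        }
    }
    std::swap(data[0], data[wall - 1]);
    const std::size_t p = wall - 1;  // final position of the pivot
    comparisons += count_comparisons(data, p, use_sample, gen);
    comparisons += count_comparisons(data + p + 1, n - p - 1, use_sample, gen);
    return comparisons;
}
```

On an already sorted array of 1000 distinct values with use_sample set to false, every partition peels off only the pivot, so the count is 999 + 998 + ... + 1 = 499500. With use_sample set to true the count depends on the random samples, but it should come out far lower on the same input.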