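I'm writing a program that must be able to sort up to 1 billion random Squares. Below is a small sample program that creates an ArrayList of random Squares and then sorts it with two different methods.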
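When I was looking for an efficient sorting method, I read that Merge Sort is supposed to be the most efficient/fastest. However, when I compared my merge sort with a custom comparator sort (I don't know whether this kind of sort has a name), I found that the comparator sort I wrote was more efficient.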
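The output I get from my program is:
Time in nanoseconds for comparator sort: 2346757466
Time in nanoseconds for merge sort: 24156585699
Standard Sort is faster
So why is the sort I wrote so much faster than merge sort? Can either of the sorts used be improved to sort faster and more efficiently?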
import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Objects;

public class SortSquares {

    public void run() {
        ArrayList<Square> list = new ArrayList<Square>();
        SecureRandom rand = new SecureRandom();
        int randSize = 10;
        for (int i = 1; i <= 10000000; i++)
            list.add(new Square(i + rand.nextInt(randSize), i + rand.nextInt(randSize)));

        //Create shallow copies to allow for timing
        ArrayList<Square> comp = new ArrayList<Square>(list);
        ArrayList<Square> merge = new ArrayList<Square>(list);

        long startTime = System.nanoTime();
        comp.sort(new SquareSort());
        long endTime = System.nanoTime();
        long duration = (endTime - startTime);
        System.out.println("Time in nanoseconds for comparator sort: " + duration);

        long startTime1 = System.nanoTime();
        merge = mergeSort(merge);
        long endTime1 = System.nanoTime();
        long duration1 = (endTime1 - startTime1);
        System.out.println("Time in nanoseconds for merge sort: " + duration1);

        if (duration < duration1)
            System.out.println("Standard Sort is faster");
        else if (duration == duration1)
            System.out.println("The sorts are the same");
        else
            System.out.println("Merge Sort is faster");
    }

    // Orders squares by x (location[0]), then by y (location[1])
    private class SquareSort implements Comparator<Square> {
        @Override
        public int compare(Square s1, Square s2) {
            if (s1.getLocation()[0] > s2.getLocation()[0]) {
                return 1;
            } else if (s1.getLocation()[0] == s2.getLocation()[0]) {
                if (s1.getLocation()[1] > s2.getLocation()[1]) {
                    return 1;
                } else if (s1.getLocation()[1] == s2.getLocation()[1]) {
                    return 0;
                } else {
                    return -1;
                }
            } else {
                return -1;
            }
        }
    }

    public ArrayList<Square> mergeSort(ArrayList<Square> whole) {
        ArrayList<Square> left = new ArrayList<Square>();
        ArrayList<Square> right = new ArrayList<Square>();
        int center;

        if (whole.size() <= 1) {
            return whole;
        } else {
            center = whole.size()/2;
            // Copy each half into its own new list
            for (int i = 0; i < center; i++) {
                left.add(whole.get(i));
            }
            for (int i = center; i < whole.size(); i++) {
                right.add(whole.get(i));
            }
            left = mergeSort(left);
            right = mergeSort(right);
            merge(left, right, whole);
        }
        return whole;
    }

    private void merge(ArrayList<Square> left, ArrayList<Square> right, ArrayList<Square> whole) {
        int leftIndex = 0;
        int rightIndex = 0;
        int wholeIndex = 0;

        while (leftIndex < left.size() && rightIndex < right.size()) {
            if ((left.get(leftIndex).compareTo(right.get(rightIndex))) < 0) {
                whole.set(wholeIndex, left.get(leftIndex));
                leftIndex++;
            } else {
                whole.set(wholeIndex, right.get(rightIndex));
                rightIndex++;
            }
            wholeIndex++;
        }

        // Copy whatever remains in the half that was not exhausted
        ArrayList<Square> rest;
        int restIndex;
        if (leftIndex >= left.size()) {
            rest = right;
            restIndex = rightIndex;
        } else {
            rest = left;
            restIndex = leftIndex;
        }
        for (int i = restIndex; i < rest.size(); i++) {
            whole.set(wholeIndex, rest.get(i));
            wholeIndex++;
        }
    }

    private class Square {
        private int[] location = new int[2];

        public Square(int x, int y) {
            location[0] = x;
            location[1] = y;
        }

        public int[] getLocation() {
            return location;
        }

        @Override
        public boolean equals(Object obj) {
            if (obj instanceof Square)
                if (getLocation()[0] == ((Square) obj).getLocation()[0] &&
                        getLocation()[1] == ((Square) obj).getLocation()[1])
                    return true;
            return false;
        }

        @Override
        public int hashCode() {
            return Objects.hash(getLocation()[0], getLocation()[1]);
        }

        public int compareTo(Square arg0) {
            if (getLocation()[0] > arg0.getLocation()[0]) {
                return 1;
            } else if (getLocation()[0] == arg0.getLocation()[0]) {
                if (getLocation()[1] > arg0.getLocation()[1]) {
                    return 1;
                } else if (getLocation()[1] == arg0.getLocation()[1]) {
                    return 0;
                } else {
                    return -1;
                }
            } else {
                return -1;
            }
        }
    }

    public static void main(String[] args) {
        SortSquares e = new SortSquares();
        e.run();
    }
}
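Answer 0: (score: 2)
You can use the java.util.Collections.sort(List list) method from the JDK. As mentioned above, it uses a merge sort (since Java 7 a merge sort variant, TimSort) with O(n log n) complexity.
To measure the performance of your implementation and compare it with other implementations, I suggest using jmh http://openjdk.java.net/projects/code-tools/jmh/. Please find a short example below.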
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

import java.util.*;
import java.util.concurrent.TimeUnit;

@BenchmarkMode( Mode.AverageTime )
@OutputTimeUnit( TimeUnit.NANOSECONDS )
@State( Scope.Benchmark )
@Warmup( iterations = 5)
@Measurement( iterations = 5 )
@Fork( value = 1)
public class SortingPerformanceBenchmark
{
    // Integer[] rather than int[]: Arrays.asList on an int[] would yield a
    // one-element List<int[]>, so nothing would actually be sorted.
    private final Integer[] dataArray = new Integer[10_000_000];
    List<Integer> arrayList;

    @Setup
    public void load() {
        Random rand = new Random();
        for (int i = 0; i < dataArray.length; ++i) {
            dataArray[i] = rand.nextInt();
        }
    }

    @Benchmark
    public List<Integer> Benchmark_SortObjects() {
        // Fresh unsorted copy on every invocation (the copy is part of the measured time)
        arrayList = new ArrayList<>( Arrays.asList( dataArray ) );
        Collections.sort( arrayList );
        return arrayList;
    }

    public static void main(String... args) throws Exception {
        Options opts = new OptionsBuilder()
            .include(SortingPerformanceBenchmark.class.getSimpleName())
            .build();
        new Runner( opts).run();
    }
}
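Answer 1: (score: 1)
It is the other way around: the standard method is much faster.
First, you create two new lists on every call of the recursive mergeSort function. The standard implementation probably merges elements within the original array and uses indices for the start and end of each range.
Second, the standard library can use several threads on multi-core machines, although that applies to Arrays.parallelSort rather than to ArrayList.sort, which runs on a single thread.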
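To illustrate the first point, here is a minimal sketch of a merge sort that works on index ranges of one array and reuses a single scratch buffer, instead of allocating two new lists per recursive call. The class name IndexedMergeSort and its method names are made up for this illustration; this is not the JDK's implementation, and it has not been benchmarked against the code in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: top-down merge sort over index ranges with one shared scratch array.
public class IndexedMergeSort {

    /** Sorts the list in place; only two arrays are allocated for the whole run. */
    public static <T> void sort(List<T> list, Comparator<? super T> cmp) {
        @SuppressWarnings("unchecked")
        T[] data = (T[]) list.toArray();
        @SuppressWarnings("unchecked")
        T[] scratch = (T[]) new Object[data.length];
        sortRange(data, scratch, 0, data.length, cmp);
        for (int i = 0; i < data.length; i++) {
            list.set(i, data[i]);
        }
    }

    // Sorts data[from, to), where 'to' is exclusive.
    private static <T> void sortRange(T[] data, T[] scratch, int from, int to,
                                      Comparator<? super T> cmp) {
        if (to - from <= 1) {
            return;
        }
        int mid = (from + to) >>> 1;
        sortRange(data, scratch, from, mid, cmp);
        sortRange(data, scratch, mid, to, cmp);
        merge(data, scratch, from, mid, to, cmp);
    }

    // Merges the sorted ranges data[from, mid) and data[mid, to).
    private static <T> void merge(T[] data, T[] scratch, int from, int mid, int to,
                                  Comparator<? super T> cmp) {
        System.arraycopy(data, from, scratch, from, to - from);
        int left = from;
        int right = mid;
        int out = from;
        while (left < mid && right < to) {
            // "<= 0" takes the left element on ties, which keeps the sort stable
            if (cmp.compare(scratch[left], scratch[right]) <= 0) {
                data[out++] = scratch[left++];
            } else {
                data[out++] = scratch[right++];
            }
        }
        while (left < mid) {
            data[out++] = scratch[left++];
        }
        while (right < to) {
            data[out++] = scratch[right++];
        }
    }

    public static void main(String[] args) {
        List<Integer> values = new ArrayList<>(Arrays.asList(5, 3, 9, 1, 5, 2));
        sort(values, Comparator.<Integer>naturalOrder());
        System.out.println(values); // prints [1, 2, 3, 5, 5, 9]
    }
}
```

For the question's data, the same method could be called with the SquareSort comparator in place of the natural ordering.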
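Answer 2: (score: 1)
Keep in mind that an algorithm's performance depends mainly on the data.
If your sort method were quicksort, it would have an O(n^2) worst-case and an O(n log n) average-case running time.
Merge sort is always O(n log n), so its running time is predictable, and it is also a stable sort. That is why it was chosen for sorting Java collections.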
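The comparator sort you call and your merge sort are both the same algorithm (sorting on Java collections is based on merge sort). You need to run the same code several times and warm up your JVM first to get more reliable results. You should also make sure that your custom merge sort actually works, and compare its output with that of the collections sort.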
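As a rough sketch of that advice (not a replacement for a real harness such as JMH), the following checks a candidate sort against List.sort and times it only after a few untimed warm-up rounds. The class SortCheck, its method names, and the choice of 5 warm-ups and 5 measured rounds are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Random;
import java.util.function.Consumer;

// Hypothetical helper: verifies a sort against the library sort and times it after warm-up.
public class SortCheck {

    /** Returns true if the candidate produces the same order as List.sort. */
    public static boolean matchesLibrarySort(List<Integer> input,
                                             Consumer<List<Integer>> candidate) {
        List<Integer> expected = new ArrayList<>(input);
        expected.sort(Comparator.<Integer>naturalOrder());
        List<Integer> actual = new ArrayList<>(input);
        candidate.accept(actual);
        return expected.equals(actual);
    }

    /** Runs untimed warm-up rounds, then returns the best measured round in nanoseconds. */
    public static long bestTimeNanos(List<Integer> input, Consumer<List<Integer>> sorter,
                                     int warmups, int rounds) {
        long best = Long.MAX_VALUE;
        for (int i = 0; i < warmups + rounds; i++) {
            List<Integer> copy = new ArrayList<>(input); // fresh unsorted copy each round
            long start = System.nanoTime();
            sorter.accept(copy);
            long elapsed = System.nanoTime() - start;
            if (i >= warmups) {
                best = Math.min(best, elapsed);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Random rand = new Random(42);
        List<Integer> input = new ArrayList<>();
        for (int i = 0; i < 1_000_000; i++) {
            input.add(rand.nextInt());
        }
        Consumer<List<Integer>> librarySort = l -> l.sort(Comparator.<Integer>naturalOrder());
        System.out.println("matches library sort: " + matchesLibrarySort(input, librarySort));
        System.out.println("best time (ns): " + bestTimeNanos(input, librarySort, 5, 5));
    }
}
```

A custom merge sort can be passed in the same way as librarySort, so that both sorts are verified and timed under the same conditions.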
In any case, you don't have to implement your own merge sort for something this simple.
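For instance, the two-level comparison from the question can be expressed with the JDK's comparator combinators and handed to List.sort. The sketch below uses plain int[] pairs as a stand-in for the question's Square.getLocation(); the class name LibrarySortSketch and the constant BY_X_THEN_Y are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: let the JDK sort, and only describe the ordering.
public class LibrarySortSketch {

    // Orders int[]{x, y} pairs by x first, then by y.
    public static final Comparator<int[]> BY_X_THEN_Y =
            Comparator.<int[]>comparingInt(p -> p[0]).thenComparingInt(p -> p[1]);

    public static void main(String[] args) {
        List<int[]> points = new ArrayList<>();
        points.add(new int[] {2, 1});
        points.add(new int[] {1, 5});
        points.add(new int[] {1, 3});

        points.sort(BY_X_THEN_Y);

        for (int[] p : points) {
            System.out.println(Arrays.toString(p));
        }
        // prints [1, 3], then [1, 5], then [2, 1]
    }
}
```

With the question's class, the key extractors would presumably read s.getLocation()[0] and s.getLocation()[1] instead of p[0] and p[1].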