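First of all, this is not about an array containing subsequences that may already be in some order before we start sorting; it is about an array with a special structure.
I'm currently writing a simple method for sorting data. Until now I have used Array.Sort, but PLINQ's OrderBy outperforms the standard Array.Sort on large arrays.
So I decided to write my own multithreaded sort implementation. The idea is simple: split the array into partitions, sort each partition in parallel, then merge all the results into one array.
So far I have finished the partitioning and sorting: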
using System;
using System.Threading;

public class PartitionSorter
{
    // Sorts each partition of arr in place on the thread pool.
    // The sorted partitions are not merged here.
    public static void Sort(int[] arr)
    {
        var ranges = Range.FromArray(arr);
        // The second constructor argument is the spin count, not a participant count.
        var allDone = new ManualResetEventSlim(false, ranges.Length*2);
        int completed = 0;
        foreach (var range in ranges)
        {
            ThreadPool.QueueUserWorkItem(r =>
            {
                var rr = (Range) r;
                Array.Sort(arr, rr.StartIndex, rr.Length);
                // The last work item to finish releases the waiting caller.
                if (Interlocked.Increment(ref completed) == ranges.Length)
                    allDone.Set();
            }, range);
        }
        allDone.Wait();
    }
}

public class Range
{
    public int StartIndex { get; }
    public int Length { get; }

    // The second argument is a length, not an end index.
    public Range(int startIndex, int length)
    {
        StartIndex = startIndex;
        Length = length;
    }

    // Splits source into one contiguous range per logical processor.
    public static Range[] FromArray<T>(T[] source)
    {
        int processorCount = Environment.ProcessorCount;
        int partitionLength = source.Length / processorCount;
        var result = new Range[processorCount];
        int start = 0;
        for (int i = 0; i < result.Length - 1; i++)
        {
            result[i] = new Range(start, partitionLength);
            start += partitionLength;
        }
        // The last range absorbs the remainder of the division.
        result[result.Length - 1] = new Range(start, source.Length - start);
        return result;
    }
}
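As a result I get an array with a special structure, for example
[1 3 5 | 2 4 7 | 6 8 9]
How can I now use this information to finish the sort? Insertion sort and similar algorithms do not exploit the fact that the data within each block is already sorted and that the blocks only need to be merged together. I tried applying some of the algorithms from merge sort, but failed.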
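One way to exploit the sorted blocks is a k-way merge: repeatedly take the smallest of the blocks' current head elements. The following is only a minimal sketch, not a tuned implementation. The names SortedRunMerger, Merge and starts are hypothetical, and it assumes starts[0] is 0 and that the runs cover the whole array. It scans all k heads for every output element, which is reasonable when k is the processor count; a heap would suit a large k better.

```csharp
using System;

public static class SortedRunMerger
{
    // Merges consecutive sorted runs of 'source' into a new sorted array.
    // 'starts' holds the start index of each run in ascending order:
    // run i covers [starts[i], starts[i + 1]), and the last run ends at source.Length.
    // Assumption: starts[0] == 0, so the runs cover the whole array.
    public static int[] Merge(int[] source, int[] starts)
    {
        var result = new int[source.Length];
        var heads = (int[]) starts.Clone();   // next unread index of each run

        for (int written = 0; written < result.Length; written++)
        {
            int best = -1;
            for (int run = 0; run < heads.Length; run++)
            {
                int end = run + 1 < starts.Length ? starts[run + 1] : source.Length;
                if (heads[run] >= end)
                    continue;                 // this run is exhausted
                if (best < 0 || source[heads[run]] < source[heads[best]])
                    best = run;
            }
            result[written] = source[heads[best]++];
        }
        return result;
    }
}
```

For the example above, SortedRunMerger.Merge(new[] { 1, 3, 5, 2, 4, 7, 6, 8, 9 }, new[] { 0, 3, 6 }) yields 1 through 9 in order. With the Range class from the question, the start indices would be the StartIndex of each range.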
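Answer 0 (score: 2)
I have run some tests with a parallel Quicksort implementation.
I tested the following code as a RELEASE build on Windows 10 x64, compiled with C# 6 (Visual Studio 2015) for .NET 4.6.1, and run outside any debugger.
My processor is a quad core with hyperthreading (which certainly helps any parallel implementation!)
The array size is 20,000,000 (so a fairly large array).
I got these results: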
LINQ OrderBy() took 00:00:14.1328090
PLINQ OrderBy() took 00:00:04.4484305
Array.Sort() took 00:00:02.3695607
Sequential took 00:00:02.7274400
Parallel took 00:00:00.7874578
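PLINQ OrderBy() is much faster than LINQ OrderBy(), but slower than Array.Sort().
QuicksortSequential() is roughly as fast as Array.Sort().
Interestingly, though, QuicksortParallelOptimised() is noticeably faster on my system, so if you have enough processor cores it is definitely an efficient way to sort.
Here is the complete, compilable console application. Remember to run it as a RELEASE build; in a DEBUG build the timings will be badly misleading.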
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

namespace Demo
{
    class Program
    {
        static void Main()
        {
            int n = 20000000;
            int[] a = new int[n];
            var rng = new Random(937525);

            for (int i = 0; i < n; ++i)
                a[i] = rng.Next();

            var b = a.ToArray();
            var d = a.ToArray();

            var sw = new Stopwatch();

            sw.Restart();
            var c = a.OrderBy(x => x).ToArray(); // Need ToArray(), otherwise it does nothing.
            Console.WriteLine("LINQ OrderBy() took " + sw.Elapsed);

            sw.Restart();
            var e = a.AsParallel().OrderBy(x => x).ToArray(); // Need ToArray(), otherwise it does nothing.
            Console.WriteLine("PLINQ OrderBy() took " + sw.Elapsed);

            sw.Restart();
            Array.Sort(d);
            Console.WriteLine("Array.Sort() took " + sw.Elapsed);

            sw.Restart();
            QuicksortSequential(a, 0, a.Length-1);
            Console.WriteLine("Sequential took " + sw.Elapsed);

            sw.Restart();
            QuicksortParallelOptimised(b, 0, b.Length-1);
            Console.WriteLine("Parallel took " + sw.Elapsed);

            // Verify that our sort implementation is actually correct!
            Trace.Assert(a.SequenceEqual(c));
            Trace.Assert(b.SequenceEqual(c));
        }

        static void QuicksortSequential<T>(T[] arr, int left, int right)
            where T : IComparable<T>
        {
            if (right > left)
            {
                int pivot = Partition(arr, left, right);
                QuicksortSequential(arr, left, pivot - 1);
                QuicksortSequential(arr, pivot + 1, right);
            }
        }

        static void QuicksortParallelOptimised<T>(T[] arr, int left, int right)
            where T : IComparable<T>
        {
            const int SEQUENTIAL_THRESHOLD = 2048;

            if (right > left)
            {
                if (right - left < SEQUENTIAL_THRESHOLD)
                {
                    QuicksortSequential(arr, left, right);
                }
                else
                {
                    int pivot = Partition(arr, left, right);
                    Parallel.Invoke(
                        () => QuicksortParallelOptimised(arr, left, pivot - 1),
                        () => QuicksortParallelOptimised(arr, pivot + 1, right));
                }
            }
        }

        static int Partition<T>(T[] arr, int low, int high) where T : IComparable<T>
        {
            int pivotPos = (high + low) / 2;
            T pivot = arr[pivotPos];
            Swap(arr, low, pivotPos);

            int left = low;
            for (int i = low + 1; i <= high; i++)
            {
                if (arr[i].CompareTo(pivot) < 0)
                {
                    left++;
                    Swap(arr, i, left);
                }
            }

            Swap(arr, low, left);
            return left;
        }

        static void Swap<T>(T[] arr, int i, int j)
        {
            T tmp = arr[i];
            arr[i] = arr[j];
            arr[j] = tmp;
        }
    }
}
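The program checks its results against the LINQ output, so the slow LINQ sort always has to run first. A linear scan is a lighter sanity check when only the ordering matters. This is a sketch with hypothetical names (SortCheck, IsSorted). It confirms order only, not that the sorted array still holds the same elements as the input.

```csharp
using System;

public static class SortCheck
{
    // Returns true when every element is less than or equal to its successor.
    // Empty and single-element arrays count as sorted.
    public static bool IsSorted<T>(T[] arr) where T : IComparable<T>
    {
        for (int i = 1; i < arr.Length; i++)
        {
            if (arr[i - 1].CompareTo(arr[i]) > 0)
                return false;
        }
        return true;
    }
}
```

In the program above it could be called as Trace.Assert(SortCheck.IsSorted(b)) after the parallel sort.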