I've read that in Haskell, when sorting a sequence, qsort only does as much work as is needed to produce the values that are actually demanded from the result (i.e., it is lazy: once it has finished the left-hand side of the first pivot and can return one value, it can supply that value on a "next" call on the iterator, and it doesn't keep sorting unless next is called again).
For example, in Haskell, head (qsort list) is O(n). It just finds the minimum value in the list and doesn't sort the rest of the list unless the rest of the result of qsort list
is accessed.
Is there a way to do this in Scala? I'd like to use sortWith on a collection but have it sort only as needed, so that I could write mySeq.sortWith(_ < _).take(3) and the sort operation wouldn't have to run to completion.
I'd also like to know whether other sort functions (like sortBy) can be used lazily, how to ensure laziness, and where to find any other documentation on when sorting is evaluated lazily in Scala.
UPDATE / EDIT: I'm looking for ways to use standard sorting functions such as sortWith. I'd rather not implement my own version of quicksort just to get lazy evaluation. Shouldn't this be built into the standard library, at least for collections like Stream that support laziness?
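For concreteness, a minimal sketch of the call in question (mySeq and its values are just example data); with the standard eager collections, the whole sort runs before take:

val mySeq = Seq(5, 3, 8, 1, 9, 2)                // example data
val smallest3 = mySeq.sortWith(_ < _).take(3)    // Seq(1, 2, 3), but only after a complete O(n log n) sort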
Answer 0 (score: 8)
I use Scala's priority queue implementation to solve this kind of partial sorting problem:
import scala.collection.mutable.PriorityQueue
val q = PriorityQueue(1289, 12, 123, 894, 1)(Ordering.Int.reverse)
Now we can call dequeue:
scala> q.dequeue
res0: Int = 1
scala> q.dequeue
res1: Int = 12
scala> q.dequeue
res2: Int = 123
Building the queue takes O(n), and taking k elements takes O(k log n).
Unfortunately PriorityQueue doesn't iterate in priority order, but it's not too hard to write an iterator that calls dequeue.
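For example, a minimal sketch of such an iterator (the name dequeueIterator is my own, not part of the standard API):

import scala.collection.mutable.PriorityQueue

// Wraps a mutable PriorityQueue in an Iterator: each next() dequeues one element,
// so elements come out in priority order and only as many as are actually requested.
def dequeueIterator[A](q: PriorityQueue[A]): Iterator[A] = new Iterator[A] {
  def hasNext: Boolean = q.nonEmpty
  def next(): A = q.dequeue()
}

// With a fresh queue like the one above: the three smallest values, without draining the rest.
val q2 = PriorityQueue(1289, 12, 123, 894, 1)(Ordering.Int.reverse)
dequeueIterator(q2).take(3).toList  // List(1, 12, 123)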
Answer 1 (score: 1)
As an example, I created an implementation of a lazy quicksort that builds a lazy tree structure (instead of producing a result list). Any i-th element of this structure, or a slice of k elements, can be asked for in O(n) time. Asking for the same element (or a nearby one) again takes only O(log n), because the tree structure built in the previous step is reused. Traversing all the elements takes O(n log n) time. (All of this assumes we choose reasonable pivots.)
The key point is that subtrees are not built right away; they are deferred as lazy computations. So when only a single element is asked for, the root node is computed in O(n), then one of its subnodes in O(n/2), and so on, until the required element is found, taking O(n + n/2 + n/4 + ...) = O(n) (a geometric series bounded by 2n). Once the tree is fully evaluated, picking any element takes O(log n), as with any balanced tree.
Note that the implementation of build is quite inefficient. I wanted it to be as simple and easy to understand as possible. The important thing is that it has the proper asymptotic bounds.
import collection.immutable.Traversable

object LazyQSort {
  /**
   * Represents a value that is evaluated at most once.
   */
  final protected class Thunk[+A](init: => A) extends Function0[A] {
    override lazy val apply: A = init;
  }

  implicit protected def toThunk[A](v: => A): Thunk[A] = new Thunk(v);
  implicit protected def fromThunk[A](t: Thunk[A]): A = t.apply;

  // -----------------------------------------------------------------

  /**
   * A lazy binary tree that keeps a list of sorted elements.
   * Subtrees are created lazily using `Thunk`s, so only
   * the necessary part of the whole tree is created for
   * each operation.
   *
   * Most notably, accessing any i-th element using `apply`
   * takes O(n) time and traversing all the elements
   * takes O(n * log n) time.
   */
  sealed abstract class Tree[+A]
    extends Function1[Int,A] with Traversable[A]
  {
    override def apply(i: Int) = findNth(this, i);

    override def head: A = apply(0);
    override def last: A = apply(size - 1);
    def max: A = last;
    def min: A = head;
    override def slice(from: Int, until: Int): Traversable[A] =
      LazyQSort.slice(this, from, until);
    // We could implement more Traversable's methods here ...
  }

  final protected case class Node[+A](
      pivot: A, leftSize: Int, override val size: Int,
      left: Thunk[Tree[A]], right: Thunk[Tree[A]]
    ) extends Tree[A]
  {
    override def foreach[U](f: A => U): Unit = {
      left.foreach(f);
      f(pivot);
      right.foreach(f);
    }
    override def isEmpty: Boolean = false;
  }

  final protected case object Leaf extends Tree[Nothing] {
    override def foreach[U](f: Nothing => U): Unit = {}
    override def size: Int = 0;
    override def isEmpty: Boolean = true;
  }

  // -----------------------------------------------------------------

  /**
   * Finds i-th element of the tree.
   */
  @annotation.tailrec
  protected def findNth[A](tree: Tree[A], n: Int): A =
    tree match {
      case Leaf => throw new ArrayIndexOutOfBoundsException(n);
      case Node(pivot, lsize, _, l, r) =>
        if (n == lsize) pivot
        else if (n < lsize) findNth(l, n)
        else findNth(r, n - lsize - 1);
    }

  /**
   * Cuts a given subinterval from the data.
   */
  def slice[A](tree: Tree[A], from: Int, until: Int): Traversable[A] =
    tree match {
      case Leaf => Leaf
      case Node(pivot, lsize, size, l, r) => {
        lazy val sl = slice(l, from, until);
        lazy val sr = slice(r, from - lsize - 1, until - lsize - 1);
        if ((until <= 0) || (from >= size)) Leaf // empty
        else if (until <= lsize) sl
        else if (from > lsize) sr
        else sl ++ Seq(pivot) ++ sr
      }
    }
  // -----------------------------------------------------------------

  /**
   * Builds a tree from a given sequence of data.
   */
  def build[A](data: Seq[A])(implicit ord: Ordering[A]): Tree[A] =
    if (data.isEmpty) Leaf
    else {
      // selecting a pivot is traditionally a complex matter,
      // for simplicity we take the middle element here
      val pivotIdx = data.size / 2;
      val pivot = data(pivotIdx);
      // this is far from perfect, but still linear
      val (l, r) = data.patch(pivotIdx, Seq.empty, 1).partition(ord.lteq(_, pivot));
      Node(pivot, l.size, data.size, { build(l) }, { build(r) });
    }
}
// ###################################################################

/**
 * Tests some operations and prints results to stdout.
 */
object LazyQSortTest extends App {
  import util.Random
  import LazyQSort._

  def trace[A](name: String, comp: => A): A = {
    val start = System.currentTimeMillis();
    val r: A = comp;
    val end = System.currentTimeMillis();
    println("-- " + name + " took " + (end - start) + "ms");
    return r;
  }

  {
    val n = 1000000;
    val rnd = Random.shuffle(0 until n);
    val tree = build(rnd);
    trace("1st element", println(tree.head));
    // Second element is much faster since most of the required
    // structure is already built
    trace("2nd element", println(tree(1)));
    trace("Last element", println(tree.last));
    trace("Median element", println(tree(n / 2)));
    trace("Median + 1 element", println(tree(n / 2 + 1)));
    trace("Some slice", for(i <- tree.slice(n/2, n/2+30)) println(i));
    trace("Traversing all elements", for(i <- tree) i);
    trace("Traversing all elements again", for(i <- tree) i);
  }
}
The output will be something like
0
-- 1st element took 268ms
1
-- 2nd element took 0ms
999999
-- Last element took 39ms
500000
-- Median element took 122ms
500001
-- Median + 1 element took 0ms
500000
...
500029
-- Slice took 6ms
-- Traversing all elements took 7904ms
-- Traversing all elements again took 191ms
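For the original question's take-three use case, a usage sketch might look like this (assuming the LazyQSort object above is in scope and compiles, which targets older Scala versions where immutable.Traversable still exists; the data values are just an illustration):

import LazyQSort._

// Only the parts of the tree needed for the first three positions are forced,
// so this does roughly O(n) work instead of completing a full sort.
val smallest3 = build(Seq(5, 3, 8, 1, 9, 2)).slice(0, 3).toList  // List(1, 2, 3)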
Answer 2 (score: 0)
You could use a Stream to build something like this. Here is a simple example; it could definitely be done better, but I think it works as an example.
def extractMin(xs: List[Int]) = {
  def extractMin(xs: List[Int], min: Int, rest: List[Int]): (Int, List[Int]) = xs match {
    case Nil => (min, rest)
    case head :: tail if head > min => extractMin(tail, min, head :: rest)
    case head :: tail => extractMin(tail, head, min :: rest)
  }

  if(xs.isEmpty) throw new NoSuchElementException("List is empty")
  else extractMin(xs.tail, xs.head, Nil)
}

def lazySort(xs: List[Int]): Stream[Int] = xs match {
  case Nil => Stream.empty
  case _ =>
    val (min, rest) = extractMin(xs)
    min #:: lazySort(rest)
}