Question

I was browsing the Stack Overflow question Non-Trivial Lazy Evaluation, which led me to Keegan McAllister's presentation, Why learn Haskell. On slide 8 he shows the minimum function, defined as:

minimum = head . sort

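and states that its complexity is O(n). I don't understand why the complexity is said to be linear when sorting is O(n log n). The sort referred to in the post can't be linear, because it assumes nothing about the data, as linear sorting methods such as counting sort would require.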

Is lazy evaluation playing a mysterious role here? If so, what is the explanation behind it?

Answer 1

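In minimum = head . sort, the sort won't be done fully, because it won't be done up front. The sort is carried out only as far as needed to produce the very first element, the one demanded by head.

In mergesort, for example, the n numbers of the list are first compared pairwise, then the winners are paired up and compared (n/2 numbers), then the new winners (n/4), and so on. In all, O(n) comparisons produce the minimal element.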

mergesortBy less [] = []
mergesortBy less xs = head $ until (null.tail) pairs [[x] | x <- xs]
  where
    pairs (x:y:t) = merge x y : pairs t
    pairs xs      = xs
    merge (x:xs) (y:ys) | less y x  = y : merge (x:xs) ys
                        | otherwise = x : merge  xs (y:ys)
    merge  xs     []                = xs
    merge  []     ys                = ys
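
To make the O(n) claim concrete, the comparisons needed for the head alone can be tallied round by round. This is a rough count, assuming for simplicity that n is a power of two and that each merge needs exactly one comparison to expose its first element:

```latex
\frac{n}{2} + \frac{n}{4} + \frac{n}{8} + \dots + 1
  \;=\; \sum_{k=1}^{\log_2 n} \frac{n}{2^k}
  \;=\; n - 1
  \;=\; O(n)
```

Put another way, merging n one-element lists down to a single list takes n - 1 merges, and producing the overall head forces one comparison in each of them.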

The code above can be augmented to tag each number it produces with the number of comparisons that went into producing it:

mgsort xs = go $ map ((,) 0) xs  where
  go [] = []
  go xs = head $ until (null.tail) pairs [[x] | x <- xs]   where
    ....
    merge ((a,b):xs) ((c,d):ys) 
            | (d < b)   = (a+c+1,d) : merge ((a+1,b):xs) ys    -- cumulative
            | otherwise = (a+c+1,b) : merge  xs ((c+1,d):ys)   --   cost
    ....

g n = concat [[a,b] | (a,b) <- zip [1,3..n] [n,n-2..1]]   -- a little scrambler

Running it for several list lengths, we see that it is indeed ~ n:

*Main> map (fst . head . mgsort . g) [10, 20, 40, 80, 160, 1600]
[9,19,39,79,159,1599]

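To check whether the sorting code itself is ~ n log n, we change it so that each produced number carries only its own cost; the total cost is then found by summing over the whole sorted list: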

    merge ((a,b):xs) ((c,d):ys) 
            | (d < b)   = (c+1,d) : merge ((a+1,b):xs) ys      -- individual
            | otherwise = (a+1,b) : merge  xs ((c+1,d):ys)     --   cost

Here are the results for lists of various lengths:

*Main> let xs = map (sum . map fst . mgsort . g) [20, 40, 80, 160, 320, 640]
[138,342,810,1866,4218,9402]

*Main> map (logBase 2) $ zipWith (/) (tail xs) xs
[1.309328,1.2439256,1.2039552,1.1766101,1.1564085]

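The above shows the empirical orders of growth for increasing list lengths n. They diminish rapidly, as ~ n log n computations typically exhibit. See also this blog post. Here is a quick correlation check: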

*Main> let xs = [n*log n | n<- [20, 40, 80, 160, 320, 640]] in 
                                    map (logBase 2) $ zipWith (/) (tail xs) xs
[1.3002739,1.2484156,1.211859,1.1846942,1.1637106]
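
The decreasing values follow from the n log n model itself. Here is a short derivation, assuming the cost is exactly T(n) = n log n (the base of the logarithm cancels in the ratio) and that the list length doubles at each step, as in the runs above:

```latex
\log_2 \frac{T(2n)}{T(n)}
  = \log_2 \frac{2n \,\log_2 (2n)}{n \,\log_2 n}
  = \log_2 \left( 2 \cdot \frac{\log_2 n + 1}{\log_2 n} \right)
  = 1 + \log_2\!\left(1 + \frac{1}{\log_2 n}\right)
```

For n = 20 this gives about 1.30, and the value decreases toward 1 as n grows, which is the trend both lists show.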

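Edit: Lazy evaluation can be seen, metaphorically, as a kind of producer/consumer idiom¹, with independent memoizing storage as an intermediary. Any productive definition we write defines a producer that produces its output bit by bit, as and when its consumers demand it, but not sooner. Whatever is produced is memoized, so that if another consumer consumes the same output at a different pace, it accesses the same storage, filled previously.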

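When no consumers remain that refer to a piece of storage, it gets garbage collected. Sometimes, with optimizations, the compiler can do away with the intermediate storage completely, cutting out the middleman.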
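
The producer/consumer picture can be observed with a small sketch. The names produce, squares, firstTwo and firstThree are hypothetical, chosen only for illustration; the sketch assumes squares is kept as a shared top-level value, which is the usual behaviour for a monomorphic top-level list.

```haskell
import Debug.Trace (trace)

-- Hypothetical producer step: announces (on stderr) when an element
-- is actually computed, then returns its square.
produce :: Int -> Int
produce n = trace ("producing " ++ show n) (n * n)

-- An infinite, lazily produced list. Nothing is computed until demanded.
squares :: [Int]
squares = map produce [1 ..]

-- Two consumers of the same storage, reading at different paces.
firstTwo :: [Int]
firstTwo = take 2 squares

firstThree :: [Int]
firstThree = take 3 squares
```

Printing firstTwo and then firstThree in GHCi should show the trace messages for 1 and 2 only once, followed by a single message for 3 when the second consumer goes one element further.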

¹ See also: Simple Generators v. Lazy Evaluation by Oleg Kiselyov, Simon Peyton-Jones and Amr Sabry.

Answer 2

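Suppose minimum' :: (Ord a) => [a] -> (a, [a]) is a function that returns the smallest element of a list together with the list with that element removed. Clearly this can be done in O(n) time. If you then define sort as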

sort :: (Ord a) => [a] -> [a]
sort [] = []                    -- base case, so that a full sort terminates
sort xs = xmin:(sort xs')
    where
      (xmin, xs') = minimum' xs

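then lazy evaluation means that in (head . sort) xs only the first element is computed. As you can see, this element is simply the first component of the pair minimum' xs, which is computed in O(n) time.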

Of course, as delnan points out, the complexity depends on how minimum' xs is implemented.
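
The answer leaves minimum' unspecified. One possible linear-time definition is sketched below; it is an assumption made for illustration, not the answerer's code, and the sort is renamed to the hypothetical selectionSort to avoid clashing with Data.List.sort.

```haskell
-- One pass over the list: returns the smallest element and the
-- remaining elements (original order kept, one occurrence removed).
minimum' :: Ord a => [a] -> (a, [a])
minimum' []  = error "minimum': empty list"
minimum' [x] = (x, [])
minimum' (x : xs)
  | x <= m    = (x, xs)        -- x itself is the minimum
  | otherwise = (m, x : rest)  -- minimum lies further on; keep x
  where
    (m, rest) = minimum' xs

-- Selection sort built on minimum', with a base case added.
selectionSort :: Ord a => [a] -> [a]
selectionSort [] = []
selectionSort xs = xmin : selectionSort xs'
  where
    (xmin, xs') = minimum' xs
```

With this definition, head (selectionSort xs) forces a single call to minimum', so it costs one traversal of xs, while forcing the whole result repeats that traversal for every element.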

Answer 3

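You have already received a good number of answers that tackle the specifics of head . sort. I'll just add a couple of more general statements.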

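With eager evaluation, the computational complexities of various algorithms compose in a simple manner. For example, the least upper bound (LUB) for f . g must be the sum of the LUBs for f and g. Thus you can treat f and g as black boxes and reason exclusively in terms of their LUBs.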

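With lazy evaluation, however, f . g can have a LUB that is better than the sum of the LUBs of f and g. You can't use black-box reasoning to prove the LUB; you must analyze the implementations and their interaction.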
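
An extreme case of this, sketched with hypothetical names (allSquares, firstAbove, firstSquareAbove): here g, run to completion on an infinite input, never finishes, yet f . g returns quickly because f demands only a prefix of g's output.

```haskell
-- g: squares every element; on an infinite list it never finishes
-- if its whole result is demanded.
allSquares :: [Integer] -> [Integer]
allSquares = map (^ 2)

-- f: returns the first element above the limit and stops there.
firstAbove :: Integer -> [Integer] -> Integer
firstAbove limit = head . filter (> limit)

-- f . g: only as many squares are computed as f actually inspects.
firstSquareAbove :: Integer -> Integer
firstSquareAbove limit = (firstAbove limit . allSquares) [1 ..]
```

For example, firstSquareAbove 50 evaluates to 64 after inspecting eight squares. Under eager evaluation the same composition would not terminate, so the cost of f . g here cannot be read off from the costs of f and g taken separately.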

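Hence the often-cited fact that the complexity of lazy evaluation is much harder to reason about than that of eager evaluation. Consider the following. Suppose you are trying to improve the asymptotic performance of a piece of code of the form f . g. In an eager language, there is an obvious strategy: pick the more complex of f and g and improve that one first. If you succeed at that, you succeed at the f . g task.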

In a lazy language, on the other hand, you can run into these situations:

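You improve the more complex of f and g, but f . g doesn't improve (or even gets worse).
You can improve f . g in ways that don't help (or even worsen) f or g.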

Answer 4

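The explanation depends on the implementation of sort, and for some implementations it will not hold. For instance, with an insertion sort that inserts at the end of the list, lazy evaluation does not help. So let's choose an implementation to look at and, for simplicity's sake, use selection sort: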

import Data.List (delete)

sort [] = []
sort (x:xs) = m : sort (delete m (x:xs)) 
  where m = foldl (\x y -> if x < y then x else y) x xs  -- m is the minimum of x:xs

This function clearly takes O(n^2) time to sort the list, but since head only needs the first element of the list, sort (delete m (x:xs)) is never evaluated!
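
That the tail is never evaluated can be checked directly. The sketch below uses the hypothetical names selSort, smallest and headIgnoresTail; selSort is the same selection sort, renamed and with min standing in for the lambda.

```haskell
import Data.List (delete)

-- Selection sort as above: the minimum goes first, the rest is a
-- recursive call that stays unevaluated until someone asks for it.
selSort :: Ord a => [a] -> [a]
selSort [] = []
selSort (x : xs) = m : selSort (delete m (x : xs))
  where
    m = foldl min x xs

-- head looks only at the first cons cell, so this costs one
-- pass of foldl and no further sorting.
smallest :: Int
smallest = head (selSort [5, 8, 1, 3, 0])

-- The same effect made explicit: this tail would crash if it were
-- ever forced, but head never touches it.
headIgnoresTail :: Int
headIgnoresTail = head (7 : error "tail was forced")
```

Evaluating headIgnoresTail returns 7 without raising the error, which is the same mechanism that keeps the recursive selSort call from running.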

Answer 5

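It's not so mysterious. How much of a list do you need to sort in order to deliver the first element? You need to find the minimal element, which can easily be done in linear time. As it happens, for some implementations of sort, lazy evaluation will do this for you.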

Answer 6

One interesting way of seeing this in practice is to trace the comparison function.

import Debug.Trace
import Data.List

myCmp x y = trace (" myCmp " ++ show x ++ " " ++ show y) $ compare x y

xs = [5,8,1,3,0,54,2,5,2,98,7]

main = do
    print "Sorting entire list"
    print $ sortBy myCmp xs

    print "Head of sorted list"
    print $ head $ sortBy myCmp xs

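First, notice how the output of the entire list is interleaved with the trace messages. Second, notice how the trace messages are similar when merely computing the head.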

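I've just run this through GHCi, and it's not exactly O(n): it takes 15 comparisons to find the first element, not the 10 that should be required. But it's still less than O(n log n).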

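Edit: as Vitus points out below, taking 15 comparisons instead of 10 is not the same as saying it's not O(n). I just meant that it takes more than the theoretical minimum.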

Answer 7

Inspired by Paul Johnson's answer, I plotted the growth rates of the two functions. First I modified his code to print one character per comparison:

import System.Random
import Debug.Trace
import Data.List
import System.Environment

rs n = do
    gen <- newStdGen
    let ns = randoms gen :: [Int]
    return $ take n ns

cmp1 x y = trace "*" $ compare x y
cmp2 x y = trace "#" $ compare x y

main = do
    n <- fmap (read . (!!0)) getArgs
    xs <- rs n
    print "Sorting entire list"
    print $ sortBy cmp1 xs

    print "Head of sorted list"
    print $ head $ sortBy cmp2 xs

Counting the * and # characters, we can sample the comparison counts at evenly spaced points (excuse my Python):

import matplotlib.pyplot as plt
import numpy as np
import envoy

res = []
x = np.arange(10, 500000, 10000)  # list lengths to sample
for i in x:
    # run the compiled Haskell program; the trace characters go to stderr
    r = envoy.run('./sortCount %i' % i)
    res.append((r.std_err.count('*'), r.std_err.count('#')))

plt.plot(x, [r[0] for r in res], label="sort")
plt.plot(x, [r[1] for r in res], label="minimum")
plt.plot(x, x*np.log2(x), label="n*log(n)")
plt.plot(x, x, label="n")
plt.legend()
plt.show()

Running the script gives us the following graph:

[Figure: growth rates]

The slope of the lower line is ..

>>> import numpy as np
>>> np.polyfit(x, [r[1] for r in res], deg=1)
array([  1.41324057, -17.7512292 ])

.. 1.41324057 (assuming it is a linear function)

Lazy evaluation and time complexity
