Question

So, finding the maximum element in a list takes O(n) time (if the list has n elements). I tried to implement an algorithm that looks faster.

(define (clever-max lst)
  (define (odd-half a-list)
    (cond ((null? a-list) (list))
          ((null? (cdr a-list))
           (cons (car a-list) (list)))
          (else
           (cons (car a-list)
                 (odd-half (cdr (cdr a-list)))))))
  (define (even-half a-list)
    (if (null? a-list)
        (list)
        (odd-half (cdr a-list))))
  (cond ((null? lst) (error "no elements in list!"))
        ((null? (cdr lst)) (car lst))
        (else
         (let ((l1 (even-half lst))
               (l2 (odd-half lst)))
           (max (clever-max l1) (clever-max l2))))))

Is this actually faster?! What would you say the asymptotic time complexity is (tight bound)?

Answer 1

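Given a list of data you know nothing about, there is no way to find the maximum element without examining every element, which takes O(n) time: if you don't check an element, you might miss the maximum. So no, your algorithm is not faster than O(n). It is actually O(n log n), because you are basically just running a merge sort.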
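To see where the O(n log n) comes from, here is a rough derivation. It assumes that building the two half-lists costs about c·n steps on a list of n elements and that a one-element list costs a constant d; both constants are introduced here only for illustration, and n is taken to be a power of two to keep the arithmetic simple.

```latex
\begin{align*}
T(1) &= d \\
T(n) &= 2\,T(n/2) + c\,n \\
     &= 4\,T(n/4) + 2\,c\,n \\
     &= 2^k\,T(n/2^k) + k\,c\,n \\
     &= d\,n + c\,n \log_2 n \qquad (k = \log_2 n) \\
     &= \Theta(n \log n)
\end{align*}
```

The linear term c·n is the cost of walking the list to split it at every level of the recursion, and that term is what pushes the total above O(n).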

More information on this can be found under Selection problem.

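I thought about it and realized I should probably do more than just state this as fact, so I coded up a quick speed test. Full disclosure: I am not a Scheme programmer, so this is in Common Lisp, but I think I converted your algorithm faithfully.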

;; Direct "iteration" method -- theoretical O(n)
(defun find-max-001 ( list )
  (labels ((fm ( list cur )
             (if (null list) cur
               (let ((head (car list))
                     (rest (cdr list)))
                 (fm rest (if (> head cur) head cur))))))
    (fm (cdr list) (car list))))

;; Your proposed method  
(defun find-max-002 ( list )
  (labels ((odd-half ( list )
             (cond ((null list) list)
                   ((null (cdr list)) (list (car list)))
                   (T (cons (car list) (odd-half (cddr list))))))
           (even-half ( list )
             (if (null list) list (odd-half (cdr list)))))
    (cond ((null list) (error "no elements in list!"))
          ((null (cdr list)) (car list))
          (T (let ((l1 (even-half list))
                   (l2 (odd-half list)))
               (max (find-max-002 l1) (find-max-002 l2)))))))

;; Simplistic speed test              
(let ((list (loop for x from 0 to 10000 collect (random 10000))))
  (progn
    (print "Running find-max-001")
    (time (find-max-001 list))
    (print "Running find-max-002")
    (time (find-max-002 list))))

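Now you may be asking yourself why I am only using 10000 as the list size, since that is fairly small for asymptotic measurements. The truth is that sbcl recognizes that the first function is tail recursive and turns it into a loop, whereas it does not do so for the second one, so this is about as big as I could go without blowing the stack. As you can see from the results below, though, it is big enough to illustrate the point.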

"Running find-max-001"
Evaluation took:
  0.000 seconds of real time
  0.000000 seconds of total run time (0.000000 user, 0.000000 system)
  100.00% CPU
  128,862 processor cycles
  0 bytes consed

"Running find-max-002"
Evaluation took:
  0.012 seconds of real time
  0.012001 seconds of total run time (0.012001 user, 0.000000 system)
  [ Run times consist of 0.008 seconds GC time, and 0.005 seconds non-GC time. ]
  100.00% CPU
  27,260,311 processor cycles
  2,138,112 bytes consed

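Even at this size we are talking about a massive slowdown. The list has to grow to about a million items before the direct check-each-item-once method slows down to what your algorithm takes on 10k items.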

 (let ((x (loop for x from 0 to 1000000 collect (random 1000000))))
   (time (find-max-001 x)))

Evaluation took:
  0.007 seconds of real time
  0.008000 seconds of total run time (0.008000 user, 0.000000 system)
  114.29% CPU
  16,817,949 processor cycles
  0 bytes consed

Final thoughts and conclusions

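So the next question that has to be asked is why the second algorithm takes so much longer. Without going into detail about tail recursion elimination, a few things really jump out.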

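The first is cons. Yes, cons is O(1), but it is still one more operation for the system to perform, and it requires the system to allocate and later free memory (the garbage collector has to kick in). The second thing that really jumps out is that you are basically running a merge sort, except that instead of grabbing the lower and upper halves of the list you grab the even and odd nodes (which also takes longer, because you have to walk the whole list every time to build the sublists). What you have is an O(n log n) algorithm at best (mind you, it is merge sort, which is really good for sorting), and it carries a lot of extra overhead.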
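The allocation cost can be estimated without running the algorithm at all. The sketch below uses a hypothetical helper, count-split-conses (my own name, not part of the code above), that counts how many cons cells the odd/even split allocates for a list of a given length. It assumes that each level of the recursion allocates one cell per element, which is what odd-half and even-half do between them.

```lisp
;; Hypothetical helper, not from the original code: counts the cons cells
;; that the odd/even split allocates for a list of N elements.
(defun count-split-conses (n)
  (if (<= n 1)
      0                                       ; lists of 0 or 1 elements are not split
      (+ n                                    ; odd-half + even-half copy each element once
         (count-split-conses (ceiling n 2))   ; recursion on the odd half
         (count-split-conses (floor n 2)))))  ; recursion on the even half

(print (count-split-conses 8))      ; 24, i.e. 8 * log2(8)
(print (count-split-conses 10001))  ; 133631
```

For the 10,001-element list used in the test this gives 133,631 cells. If a cons cell takes 16 bytes (typical for 64-bit SBCL, but an assumption about the setup used here), that is roughly 2.1 MB, in line with the bytes consed reported above, while the direct method allocates nothing.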
