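Arrays vs. lists in Lisp: why are lists so much faster in the code below?

Time: 2016-07-20 23:42:04

Tags: math common-lisp discrete-mathematics pythagorean

I got an unexpected result while solving Problem 75 in Project Euler. My code does find the correct solution, but it behaves strangely.

My solution consists of traversing the tree of primitive Pythagorean triples (using Barning's matrices) until the perimeter limit is reached, counting how many times each perimeter value occurs, and finally counting the perimeters that occur exactly once. My admittedly untidy but functional code is: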

(defparameter *barning-matrixes*
   '(#(1 -2  2) #(2 -1  2) #(2 -2  3)
     #(1  2  2) #(2  1  2) #(2  2  3)
     #(-1 2  2) #(-2 1  2) #(-2 2  3)))

(defparameter *lengths* (make-array 1500001 :initial-element 0))

(defun expand-node (n)
  "Takes a primitive Pythagorean triple in a vector and traverses subsequent nodes in the tree of primitives until perimeter > 1,500,000"
  (let ((perimeter (reduce #'+ n)))
    (unless (> perimeter 1500000)
      (let ((next-nodes (mapcar #'(lambda (x)
                                    (reduce #'+ (map 'vector #'* n x)))
                                *barning-matrixes*)))
        (loop for i from perimeter to 1500000 by perimeter
              do (incf (aref *lengths* i)))
        (expand-node (subseq next-nodes 0 3))
        (expand-node (subseq next-nodes 3 6))
        (expand-node (subseq next-nodes 6 9))))))

(expand-node #(3 4 5))  ; Takes too darn long :-(
(count 1 *lengths*)
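
To make a single tree step concrete, here is a minimal sketch of what one expansion computes. The helper names `dot` and `children` are hypothetical and not part of the code above; the matrix rows are the same nine rows, written as lists for brevity.

```lisp
;; DOT and CHILDREN are illustrative names only.
(defun dot (a b)
  "Sum of the pairwise products of two equal-length lists."
  (reduce #'+ (mapcar #'* a b)))

(defun children (triple)
  "Return the three child triples of TRIPLE, one per Barning matrix."
  (let* ((rows '((1 -2 2) (2 -1 2) (2 -2 3)
                 (1  2 2) (2  1 2) (2  2 3)
                 (-1 2 2) (-2 1 2) (-2 2 3)))
         ;; Nine dot products: three rows for each of the three matrices.
         (sums (mapcar (lambda (row) (dot triple row)) rows)))
    (list (subseq sums 0 3) (subseq sums 3 6) (subseq sums 6 9))))

(children '(3 4 5))
;; => ((5 12 13) (21 20 29) (15 8 17))
```

Each child is again a primitive triple, and its perimeter is the sum of its three elements.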

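I expected the tree expansion to run in a few milliseconds, but the expand-node function took 8.65 seconds, far more than expected, to traverse a tree that is not very large.

However, I was surprised when I tweaked the code to remove the vectors...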

(defparameter *barning-matrixes*
   '((1 -2  2) (2 -1  2) (2 -2  3)
     (1  2  2) (2  1  2) (2  2  3)
     (-1 2  2) (-2 1  2) (-2 2  3)))


(defparameter *lengths* (make-array 1500001 :initial-element 0))

(defun expand-node (n)
  "Takes a primitive Pythagorean triple in a list and traverses subsequent nodes in the tree of primitives until perimeter > 1,500,000"
  (let ((perimeter (reduce #'+ n)))
    (unless (> perimeter 1500000)
      (let ((next-nodes (mapcar #'(lambda (x) (reduce #'+ (mapcar #'* n x)))
                                *barning-matrixes*)))
        (loop for i from perimeter to 1500000 by perimeter
              do (incf (aref *lengths* i)))
        (expand-node (subseq next-nodes 0 3))
        (expand-node (subseq next-nodes 3 6))
        (expand-node (subseq next-nodes 6 9))))))

(expand-node '(3 4 5))  ; Much faster, but why?!
(count 1 *lengths*)

...and it ran very fast, taking only 35 milliseconds. I am intrigued by this huge difference and hope someone can explain why it happens.

Thanks, Paulo

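PS: I am using CCL.

4 answers:

Answer 0: (score: 3)

You did not say which implementation you are using.

You need to find out where the time is being spent.

But to me it looks like your Common Lisp's implementation of MAP, when given a list and a vector of equal length and asked to produce a new vector, may be very inefficient. Even allowing for some overhead from allocating the new vector, an implementation could be much faster.

Try implementing the vector operation as a LOOP and compare: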

(loop with v = (make-array (length n))
      for n1 across n
      for x1 across x
      for i from 0
      do (setf (aref v i) (* n1 x1))
      finally (return v))

This faster version also works, with the list operations replaced by vector operations:

(defparameter *barning-matrixes*
  #(#(1 -2  2) #(2 -1  2) #(2 -2  3) #(1  2  2) #(2  1  2) #(2  2  3) #(-1 2  2) #(-2 1  2) #(-2 2  3)))

(defparameter *lengths* (make-array 1500001 :initial-element 0))

(defun expand-node (n)
  "Takes a primitive Pythagorean triple in a vector and traverses subsequent nodes in the tree of primitives until perimeter > 1,500,000"
  (let ((perimeter (reduce #'+ n)))
    (unless (> perimeter 1500000)
      (let ((next-nodes
             (loop with v = (make-array (length *barning-matrixes*))
                   for e across *barning-matrixes*
                   for i from 0
                   do (setf (aref v i)
                            (reduce #'+
                                    (loop with v = (make-array (length n))
                                          for n1 across n
                                          for x1 across e
                                          for i from 0
                                          do (setf (aref v i) (* n1 x1))
                                          finally (return v))))
                   finally (return v))))
        (loop for i from perimeter to 1500000 by perimeter
              do (incf (aref *lengths* i)))
        (expand-node (subseq next-nodes 0 3))
        (expand-node (subseq next-nodes 3 6))
        (expand-node (subseq next-nodes 6 9))))))

(time (expand-node #(3 4 5)))

Let's look at your code:

(defun expand-node (n)

; Here we don't know what type N is. You call it from the toplevel
; with a vector, but the recursive calls pass it a list.

  "Takes a primitive Pythagorean triple in a vector and traverses
 subsequent nodes in the tree of primitives until perimeter > 1,500,000"
  (let ((perimeter (reduce #'+ n)))
    (unless (> perimeter 1500000)
      (let ((next-nodes (mapcar #'(lambda (x)    ; this mapcar creates a list
                                    (reduce #'+
                                            (map 'vector
                                                 #'*
                                                 n  ; <- list or vector
                                                 x))) ; <- vector
                                *barning-matrixes*)))
        (loop for i from perimeter to 1500000 by perimeter
              do (incf (aref *lengths* i)))
        (expand-node (subseq next-nodes 0 3))   ; next-nodes is a list, so this subseq returns a list
        (expand-node (subseq next-nodes 3 6))
        (expand-node (subseq next-nodes 6 9))))))

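So most of the time you call MAP with a list and a vector. What is the size of the result vector? MAP has to find out, either by traversing the list or in some other way. The length of the result vector is that of the shortest argument sequence. Then it has to iterate over both the list and the vector. If MAP uses generic sequence operations, every element access on the list traverses the list from its head. Obviously one can write an optimized version that does all of this faster, but a Common Lisp implementation may choose to provide only a generic implementation of MAP...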
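
The cost described above can be sketched as follows. Both functions are hypothetical illustrations (`naive-map-vector` and `walking-map-vector` are made-up names); whether any particular implementation's MAP resembles the first one is an assumption, not something verified here.

```lisp
;; Illustrative only: not how any particular implementation defines MAP.
(defun naive-map-vector (fn seq1 seq2)
  "Generic version: uses LENGTH and ELT, so every access to a list
argument walks that list again from its head."
  (let* ((len (min (length seq1) (length seq2)))
         (result (make-array len)))
    (dotimes (i len result)
      (setf (aref result i)
            (funcall fn (elt seq1 i) (elt seq2 i))))))

(defun walking-map-vector (fn list vector)
  "Specialized version for a list and a vector: the list is walked once."
  (let* ((len (min (length list) (length vector)))
         (result (make-array len)))
    (loop for item in list
          for i from 0 below len
          do (setf (aref result i)
                   (funcall fn item (aref vector i))))
    result))
```

Both return the same vector for `'(3 4 5)` and `#(1 -2 2)`; they differ only in how often the list is traversed and in how much type dispatch happens per element.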

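Answer 1: (score: 3)

Welcome to the intricacies of Common Lisp optimization! The first thing to note is that different implementations apply different optimization strategies. I tried your examples in SBCL, and both ran very efficiently in almost the same time, whereas in CCL the vector version ran much more slowly than the list version. I don't know which implementation you tried, but you could try several and see very different execution times.

From some tests in CCL, it seems to me that the main problem comes from this form: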

(map 'vector #'* n x)

which executes much more slowly than the corresponding list version:

(mapcar #'* n x)

Using time, I saw that the vector version allocates (conses) far more.
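
A minimal way to reproduce that comparison in isolation might look like the sketch below. `time-both` is a hypothetical helper, the iteration count is arbitrary, and the exact report printed by `time` differs between implementations.

```lisp
;; TIME-BOTH is an illustrative name; results are kept in variables so the
;; calls are less likely to be optimized away.
(defun time-both (&optional (iterations 1000000))
  "Time MAP 'VECTOR on a list and a vector against MAPCAR on two lists."
  (let ((n '(3 4 5))            ; the recursive calls pass a list
        (row-vector #(1 -2 2))  ; matrix row stored as a vector
        (row-list '(1 -2 2))    ; the same row stored as a list
        (last-vector nil)
        (last-list nil))
    (format t "~&MAP 'VECTOR over a list and a vector:~%")
    (time (dotimes (i iterations)
            (setf last-vector (map 'vector #'* n row-vector))))
    (format t "~&MAPCAR over two lists:~%")
    (time (dotimes (i iterations)
            (setf last-list (mapcar #'* n row-list))))
    (values last-vector last-list)))
```

Calling `(time-both)` prints two reports; comparing their run times and allocation figures shows how the two forms behave on a given implementation.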

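Simply replacing map with map-into and an auxiliary vector confirms this first impression. In fact, the following version is slightly faster than the list version in CCL: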

(defun expand-node (n)
"Takes a primitive Pythagorean triple in a vector and traverses subsequent nodes in the tree of primitives until perimeter > 1,500,000"
   (let ((perimeter (reduce #'+ n))
         (temp-vector (make-array 3 :initial-element 0)))
     (unless (> perimeter 1500000)
       (let ((next-nodes (mapcar #'(lambda (x)
                                   (reduce #'+ (map-into temp-vector #'* n x))) *barning-matrixes*)))
         (loop for i from perimeter to 1500000 by perimeter
               do (incf (aref *lengths* i)))
         (expand-node (subseq next-nodes 0 3))
         (expand-node (subseq next-nodes 3 6))
         (expand-node (subseq next-nodes 6 9))))))
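
The difference between the two operators can be seen in a small sketch (`map-into-demo` is a hypothetical name used only for illustration): `map` allocates a fresh result on each call, while `map-into` overwrites and returns the sequence it is given.

```lisp
;; MAP-INTO-DEMO is an illustrative name, not part of the answer's code.
(defun map-into-demo ()
  (let* ((buffer (make-array 3 :initial-element 0))
         (fresh  (map 'vector #'* '(3 4 5) #(1 -2 2)))
         (reused (map-into buffer #'* '(3 4 5) #(1 -2 2))))
    (list (eq reused buffer)      ; MAP-INTO returns its destination
          (eq fresh buffer)       ; MAP allocated a separate vector
          (reduce #'+ reused))))  ; 3 - 8 + 10

(map-into-demo)
;; => (T NIL 5)
```

In the version above, the buffer is created once per call to expand-node and shared by all nine row products. That is safe because reduce consumes each result before the next map-into call overwrites it.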

Answer 2: (score: 2)

Inspecting the vector #(1 2 3) on SBCL gives:

Dimensions: (3)
Element type: T
Total size: 3
Adjustable: NIL
Fill pointer: NIL
Contents:
0: 1
1: 2
2: 3

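You can see that more data is stored than for a list, even though the exact internal representation varies between implementations. For small vectors that keep being copied, as in your example, you may end up allocating more memory than with lists, which is visible in the "bytes consed" lines below. Allocating memory adds to the run time. Note that in my tests the time difference is not as large as in yours.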

;; VECTORS
(time (expand-node #(3 4 5)))
;; Evaluation took:
;;   2.060 seconds of real time
;;   2.062500 seconds of total run time (1.765625 user, 0.296875 system)
;;   [ Run times consist of 0.186 seconds GC time, and 1.877 seconds non-GC time. ]
;;   100.10% CPU
;;   4,903,137,055 processor cycles
;;   202,276,032 bytes consed

;; LISTS
(time (expand-node* '(3 4 5)))
;; Evaluation took:
;;   0.610 seconds of real time
;;   0.609375 seconds of total run time (0.609375 user, 0.000000 system)
;;   [ Run times consist of 0.016 seconds GC time, and 0.594 seconds non-GC time. ]
;;   99.84% CPU
;;   1,432,603,387 processor cycles
;;   80,902,560 bytes consed

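Answer 3: (score: 2)

Everyone had already answered while I was trying to optimize the code, so I will just put this version here without bothering to explain much. It should run quite fast, at least on SBCL.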

(declaim (optimize (speed 3) (safety 0) (debug 0)))

(declaim (type (simple-array (simple-array fixnum 1) 1) *barning-matrixes*))
(defparameter *barning-matrixes*
  (map '(simple-array (simple-array fixnum 1) 1)
       (lambda (list)
         (make-array 3 :element-type 'fixnum
                       :initial-contents list))
       '((1 -2 2) (2 -1 2) (2 -2 3)
         (1 2 2) (2 1 2) (2 2 3)
         (-1 2 2) (-2 1 2) (-2 2 3))))

(declaim (type (simple-array fixnum 1) *lengths*))
(defparameter *lengths* (make-array 1500001 :element-type 'fixnum
                                            :initial-element 0))

(declaim (ftype (function ((simple-array fixnum 1))) expand-node))
(defun expand-node (n)
  "Takes a primitive Pythagorean triple in a vector and traverses subsequent nodes in the tree of primitives until perimeter > 1,500,000"
  (loop with list-of-ns = (list n)
        for n = (pop list-of-ns)
        while n
        do (let ((perimeter (let ((result 0))
                              (declare (type fixnum result))
                              (dotimes (i (length n) result)
                                (incf result (aref n i))))))
             (declare (type fixnum perimeter))
             (unless (> perimeter 1500000)
               (let ((next-nodes
                       (let ((result (list)))
                         (dotimes (matrix 9 (nreverse result))
                           (let ((matrix (aref *barning-matrixes* matrix)))
                             (push (let ((result 0))
                                     (declare (type fixnum result))
                                     (dotimes (i 3 result)
                                       (incf result
                                             (the fixnum
                                                  (* (the fixnum (aref matrix i))
                                                     (the fixnum (aref n i)))))))
                                   result))))))
                 (declare (type list next-nodes))
                 (loop for i from perimeter to 1500000 by perimeter
                       do (incf (aref *lengths* i)))
                 (dotimes (i 3)
                   (push (make-array 3 :element-type 'fixnum
                                       :initial-contents (list (pop next-nodes)
                                                               (pop next-nodes)
                                                               (pop next-nodes)))
                         list-of-ns))))))
  (values))

On my slow laptop:

CL-USER> (load (compile-file #P"e75.lisp"))
; ...compilation notes...
CL-USER> (time (expand-node (make-array 3 :element-type 'fixnum
                                          :initial-contents '(3 4 5))))
Evaluation took:
  0.274 seconds of real time
  0.264000 seconds of total run time (0.264000 user, 0.000000 system)
  96.35% CPU
  382,768,596 processor cycles
  35,413,600 bytes consed

; No values
CL-USER> (count 1 *lengths*)
161667 (18 bits, #x27783)

The original code ran in about 1.8 seconds with vectors and 0.8 seconds with lists.