测试SBCL `delete-duplicates`实施：

Question

我正在寻找像#'delete-duplicates这样的东西，但我知道列表中的所有元素都已经排序，或者反向排序，或至少排列，以便重复项已经相互相邻。我希望利用这些知识来确保执行速度不符合列表中元素数量的平方。使用#'maplist来发展我自己的解决方案是微不足道的，但语言中是否有？重新发明轮子会很尴尬。

要明确的是，对于较长的列表，我希望删除的运行时间与列表的长度成比例，而不是与该长度的平方成正比。这是我希望避免的行为：

1 (defun one-shot (cardinality) 2 (labels ((generate-list (the-count) 3 (let* ((the-list (make-list the-count))) 4 (do ((iterator 0 (1+ iterator))) 5 ((>= iterator the-count)) 6 (setf (nth iterator the-list) iterator)) 7 the-list))) 8 (let* ((given-list (generate-list cardinality)) 9 (stripped-list) 10 (start-time) 11 (end-time)) 12 (setf start-time (get-universal-time)) 13 (setf stripped-list (delete-duplicates given-list :test #'eql)) 14 (setf end-time (get-universal-time)) 15 (princ "for n = ") 16 (princ cardinality) 17 (princ ", #'delete-duplicates took ") 18 (princ (- end-time start-time)) 19 (princ " seconds") 20 (terpri)))) 21 (one-shot 20000) 22 (one-shot 40000) 23 (one-shot 80000) for n = 20000, #'delete-duplicates took 6 seconds for n = 40000, #'delete-duplicates took 24 seconds for n = 80000, #'delete-duplicates took 95 seconds

Answer 1

语言中没有这样的东西，但是这样的东西只能通过列表：

(defun delete-adjacent-duplicates (list &key key (test 'eql))
  (loop
     for head = list then (cdr head)
     until (endp head)
     finally (return list)
     do (setf (cdr head)
              (member (if (null key) (car head)
                          (funcall key (car head)))
                      (cdr head)
                      :key key :test-not test))))

作为@wvxvw pointed out，可以使用(loop for head on list finally (return list) do ...)简化此迭代。但是，3.6 Traversal Rules and Side Effects表示在对象遍历期间修改列表的cdr链会导致未定义的行为。但是，不清楚loop for head on list在技术上是否是对象遍历操作。关于循环的文档在6.1.2.1.3 The for-as-on-list subclause中说明

在for-as-on-list子句中，for或as构造迭代一个列表。 ...... 变量var绑定到form1中列表的连续尾部。在每次迭代结束时，将函数step-fun应用于列表; step-fun的默认值是cdr。 ... for或as构造到达列表末尾时终止。

这表示步骤函数总是在迭代结束时应用，所以听起来loop for head on list应该没问题。无论如何，使用do循环可以避免任何可能的问题：

(defun delete-adjacent-duplicates (list &key key (test 'eql))
  (do ((head list (cdr head)))
      ((endp head) list)
    (setf (cdr head)
          (member (if (null key) (car head)
                      (funcall key (car head)))
                  (cdr head)
                  :key key :test-not test))))

我们的想法是从head作为列表开始，然后将cdr设置为以不同元素开头的第一个尾部，然后推进头部，并继续直到没有任何剩余。假设member以合理的方式实现，这应该是列表长度的线性。 member的使用意味着您无需执行任何额外工作即可使:key和:test以适当的方式工作。（请注意，:test的{{1}}将成为del-dups的{{1}}。）注意：实际上这有一个小问题，因为{ {1}}函数会为最终列表中的每个元素调用两次：一次是尾部的第一个元素，一次是:test-not的{{1}}。

member

我希望任何线性时间解决方案都采用类似的方法;保持对当前头部的引用，找到以不同元素开头的下一个尾部，然后将该尾部设为头部的key。

Answer 2

我希望REMOVE-DUPLICATES能够实现线性时间。（事实上它确实*在我当地的SBCL安装上。）

请注意，REMOVE-DUPLICATES和DELETE-DUPLICATES指定具有相同的返回值，并且不保证DELETE-DUPLICATES的副作用。

*线性时间码路径仅在以下情况下进行：test为＃'eq，＃'eql，＃'等于或＃'equalp（它依赖于哈希表）并且没有：key或：test -not参数提供。

Answer 3

语言标准中没有类似的东西。但是，您可以使用loop：

执行此操作

(defun remove-adjacent-duplicates (list &key (test #'eql))
  (loop for obj in list 
        and prev = nil then obj 
        for take = t then (not (funcall test obj prev))
        when take collect obj))

或reduce（练习留给读者）。

有关破坏性的实施，请参阅the other answer。

PS。除非你在时间上做一些棘手的事情，否则你最好不要使用time。

Answer 4

记录：您的测试代码基本上就是这样：

(defun one-shot (n &aux (list (loop for i below n collect i)))
  (time (delete-duplicates list))
  (values))

在缓慢删除重复的情况下与实现维护者交谈可能也很有用。

例如(one-shot 1000000)在Mac上的CCL中运行一秒钟。在LispWorks中，它运行0.155秒。

Answer 5

有点不同的方法：

(defun compress-duplicates (list &key (test #'eql))
  (labels ((%compress-duplicates (head tail)
             (if (null tail)
               (setf (cdr head) tail)
               (progn (unless (funcall test (car head) (car tail))
                        (setf (cdr head) tail head (cdr head)))
                      (%compress-duplicates head (cdr tail))))))
    (%compress-duplicates list (cdr list)) 
    list))

(compress-duplicates (list 1 1 1 2 2 3 4 4 1 1 1))
;; (1 2 3 4 1)

测试SBCL `delete-duplicates`实施：

(defun test-delete-duplicates ()
  (labels ((%test (list)
             (gc)
             (time (delete-duplicates list))))
    (loop
       :repeat 6
       :for list := (loop :for i :from 0 :below 1000
                       :collect (random 100))
       :then (append list list) :do (%test (copy-list list)))))

;; (test-delete-duplicates)

;; Evaluation took:
;;   0.002 seconds of real time
;;   0.002000 seconds of total run time (0.002000 user, 0.000000 system)
;;   100.00% CPU
;;   3,103,936 processor cycles
;;   0 bytes consed

;; Evaluation took:
;;   0.003 seconds of real time
;;   0.003000 seconds of total run time (0.003000 user, 0.000000 system)
;;   100.00% CPU
;;   6,347,431 processor cycles
;;   0 bytes consed

;; Evaluation took:
;;   0.006 seconds of real time
;;   0.006000 seconds of total run time (0.005000 user, 0.001000 system)
;;   100.00% CPU
;;   12,909,947 processor cycles
;;   0 bytes consed

;; Evaluation took:
;;   0.012 seconds of real time
;;   0.012000 seconds of total run time (0.012000 user, 0.000000 system)
;;   100.00% CPU
;;   25,253,024 processor cycles
;;   0 bytes consed

;; Evaluation took:
;;   0.023 seconds of real time
;;   0.022000 seconds of total run time (0.022000 user, 0.000000 system)
;;   95.65% CPU
;;   50,716,442 processor cycles
;;   0 bytes consed

;; Evaluation took:
;;   0.049 seconds of real time
;;   0.050000 seconds of total run time (0.050000 user, 0.000000 system)
;;   102.04% CPU
;;   106,747,876 processor cycles
;;   0 bytes consed

显示线速度。

ECL `delete-duplicates`实施的测试：

;; (test-delete-duplicates)
;; real time : 0.003 secs
;; run time  : 0.003 secs
;; gc count  : 1 times
;; consed    : 95796160 bytes
;; real time : 0.007 secs
;; run time  : 0.006 secs
;; gc count  : 1 times
;; consed    : 95874304 bytes
;; real time : 0.014 secs
;; run time  : 0.014 secs
;; gc count  : 1 times
;; consed    : 95989920 bytes
;; real time : 0.028 secs
;; run time  : 0.027 secs
;; gc count  : 1 times
;; consed    : 96207136 bytes
;; real time : 0.058 secs
;; run time  : 0.058 secs
;; gc count  : 1 times
;; consed    : 96617536 bytes
;; real time : 0.120 secs
;; run time  : 0.120 secs
;; gc count  : 1 times
;; consed    : 97412352 bytes

线性时间也会增加。

加速删除相邻的重复项

5 个答案:

测试SBCL `delete-duplicates`实施：

ECL `delete-duplicates`实施的测试：

加速删除相邻的重复项

5 个答案:

测试SBCL delete-duplicates实施：

ECL delete-duplicates实施的测试：

测试SBCL `delete-duplicates`实施：

ECL `delete-duplicates`实施的测试：