Question

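Task statement: concatenate two lists of 1e7 elements and find their sum. I'm trying to figure out an idiomatic way to write this in Clojure, and possibly also a fast non-idiomatic way, if warranted.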

Here is what I have so far:

(def a (doall (vec (repeat 1e7 1))))
(def b (doall (vec (repeat 1e7 1))))
(println "Clojure:")
(time (def c (concat a b)))
(time (reduce + c))

Here are the results, using Clojure 1.9.0 and the shell command clojure -e '(load-file "clojure/concat.clj")':

Clojure:
"Elapsed time: 0.042615 msecs"
"Elapsed time: 636.798833 msecs"
20000000

Compared to naive implementations in Python (156 ms), Java (159 ms), SBCL (120 ms), and C++ using STL algorithms (60 ms), there is a lot of room for improvement.

Answer 1

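I was curious about the tradeoff between just adding numbers and memory allocation, so I wrote some test code that uses both Clojure vectors and primitive (Java) arrays. Results: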

; verify we added numbers in (range 1e7) once or twice 
(sum-vec)        => 49999995000000 
(into-sum-vec)   => 99999990000000

ARRAY  power =  7 
"Elapsed time: 21.840198 msecs"       ; sum once 
"Elapsed time: 45.036781 msecs"       ; 2 sub-sums, then add sub-totals 
(timing (sum-sum-arr)) => 99999990000000 
"Elapsed time: 397.254961 msecs"      ; copy into 2x array, then sum 
(timing (sum-arr2)) => 99999990000000

VECTOR  power =  7  
"Elapsed time: 112.522111 msecs"    ; sum once from vector 
"Elapsed time: 387.757729 msecs"    ; make 2x vector, then sum

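So we see that, using a primitive long array (on my machine), it takes 21 ms to sum 1e7 integers. If we do that sum twice and add the sub-totals, we get 45 ms of elapsed time.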
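The same sub-total idea can be applied directly to vectors like the ones in the question. The following is a minimal sketch, not a benchmarked solution: the name sum-both is hypothetical, and the vectors are kept small so the example runs quickly.

```clojure
;; Small stand-ins for the question's two vectors (assumption: 1000
;; elements instead of 1e7, purely to keep the example fast).
(def a (vec (repeat 1000 1)))
(def b (vec (repeat 1000 1)))

(defn sum-both
  "Sums two collections separately and adds the sub-totals,
  so no concatenated collection is ever built."
  [xs ys]
  (+ (reduce + xs) (reduce + ys)))

(println (sum-both a b)) ; prints 2000
```

Because each vector is reduced in place, the only extra work compared to a single sum is one final addition.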

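If we allocate a new array of length 2e7, copy the first array into it twice, and then sum the values, it takes about 400 ms, roughly 8x slower than the summing alone. So memory allocation and copying is by far the largest cost.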

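For the native Clojure vector case, it takes 112 ms just to sum a preallocated vector of 1e7 integers. Combining the original vector with itself into a 2e7 vector and then summing costs about 400 ms, similar to the low-level array case. So for large lists of data, the memory IO cost overwhelms the details of native Java arrays vs Clojure vectors.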
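Since building the combined collection is the dominant cost, one option is to reduce across several collections in a single pass without materializing the result. The sketch below uses the built-in cat transducer with transduce; the name sum-all is hypothetical, and no timings are claimed for it here.

```clojure
(defn sum-all
  "Sums every element of every collection in colls.
  The cat transducer feeds each inner collection's elements
  straight into +, so no combined vector or lazy seq is created."
  [colls]
  (transduce cat + 0 colls))

;; Small example vector (assumption: 1000 elements for a quick run).
(def v (vec (range 1000)))

(println (sum-all [v v])) ; prints 999000
```

This keeps the "concatenate, then sum" shape of the original task while skipping the intermediate allocation.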

Code for the above (requires [tupelo "0.9.69"]):

(ns tst.demo.core
  (:use tupelo.core tupelo.test)
  (:require [criterium.core :as crit]))

(defmacro timing [& forms]
  ; `(crit/quick-bench ~@forms)
  `(time ~@forms))

(def power 7)
(def reps (Math/pow 10 power))

(def data-vals (range reps))
(def data-vec (vec data-vals))
(def data-arr (long-array data-vals))

; *** BEWARE of small errors causing reflection => 1000x slowdown ***
(defn sum-arr-1 []
  (areduce data-arr i accum 0
    (+ accum (aget data-arr i)))) ;      =>  6300 ms (power 6)
(defn sum-arr []
  (let [data ^longs data-arr]
    (areduce data i accum 0
      (+ accum (aget data i))))) ;       =>     8 ms (power 6)

(defn sum-sum-arr []
  (let [data   ^longs data-arr
        sum1   (areduce data i accum 0
                 (+ accum (aget data i)))
        sum2   (areduce data i accum 0
                 (+ accum (aget data i)))
        result (+ sum1 sum2)]
    result))

(defn sum-arr2 []
  (let [data   ^longs data-arr
        data2  (long-array (* 2 reps))
        >>     (dotimes [i reps] (aset data2 i (aget data i)))
        >>     (dotimes [i reps] (aset data2 (+ reps i) (aget data i)))
        result (areduce data2 i accum 0
                 (+ accum (aget data2 i)))]
    result))


(defn sum-vec      [] (reduce + data-vec))
(defn into-sum-vec [] (reduce + (into data-vec data-vec)))

(dotest
  (is= (spyx (sum-vec))
    (sum-arr))

  (is= (spyx (into-sum-vec))
    (sum-arr2)
    (sum-sum-arr))

  (newline) (println "-----------------------------------------------------------------------------")
  (println "ARRAY  power = " power)
  (timing (sum-arr))
  (spyx (timing (sum-sum-arr)))
  (spyx (timing (sum-arr2)))

  (newline) (println "-----------------------------------------------------------------------------")
  (println "VECTOR  power = " power)
  (timing (sum-vec))
  (timing (into-sum-vec)))

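You can switch from time to Criterium by changing which line is commented out in the timing macro. However, Criterium is meant for short tasks, so you should probably keep power at 5 or 6.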
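Regarding the reflection warning in the code comments: one way to catch that kind of slip early is to turn on *warn-on-reflection* and put the type hint on the function parameter. This is a minimal sketch with hypothetical names (xs, sum-hinted), assuming it is loaded as a script or entered at a REPL, where that var can be set.

```clojure
;; Ask the compiler to print a warning for every call it has to
;; resolve by reflection at run time.
(set! *warn-on-reflection* true)

;; A small primitive long array (assumption: 1000 elements).
(def xs (long-array (range 1000)))

(defn sum-hinted
  "Sums a primitive long array. The ^longs hint lets aget and
  alength compile to direct array operations, not reflection."
  [^longs arr]
  (areduce arr i acc 0
    (+ acc (aget arr i))))

(println (sum-hinted xs)) ; prints 499500
```

Without the hint, the compiler would be expected to emit a reflection warning for the array access, which flags the slow path before any timing is run.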
