Can't explain results when testing Clojure data structures

Date: 2014-01-17 22:26:53

Tags: testing data-structures clojure

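I'm fairly new to Clojure and to functional programming in general. I was curious about the speed of some basic operations on data structures (Clojure's defaults and some I might implement myself), so I wrote something to automate testing operations such as adding to a structure.

My function that runs the tests on 3 data structures consistently returns very different average run times depending on how it is called, even though its inputs stay the same.

The code follows, with the tests and results at the bottom.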

(import '(java.util Date))  

(defrecord test-suite ;;holds the test results for 3 datastructures  
[t-list  
 t-vector  
 t-set]
)

(defrecord test-series ;;holds list of test results (list of test-suite) and the list of functions used in the respective tests  
[t-suites  
 t-functions])  

;;;Runs the test, returns time it took  
(defn time-test [func init-ds delta-list]  
  (def startTime (. (new Date) (getTime)))  
  (reduce func init-ds delta-list)
  (def endTime (. (new Date) (getTime)))  
  (- endTime startTime)  
)  

;;;Runs the test x number of times returning the average run time  
(defn test-struct ([iter func init-ds delta-list] (test-struct iter func init-ds delta-list ()))  
  ([iter ;;number of times to run tests  
    func ;;function being tested (add remove etc)  
    init-ds ;;initial datastructure being tested  
    delta-list  
    addRes ;;test results  
    ]  
  (println (first addRes));;print previous run time for debugging  
  ;;test if done recursing  
  (if (> iter 0)  
    (test-struct   
     (- iter 1)   
     func  
     init-ds  
     delta-list  
     (conj addRes (time-test func init-ds delta-list)))  
    (/ (reduce + addRes) (count addRes)))  
))  

;;;Tests a function on a passed in data structure and a randomly generated list of numbers  
(defn run-test   
  [iter ;;the number of times each test will be run  
   func ;;the function being tested  
   init-ds] ;;the initial data structure being tested
  (def delta-list (shuffle (range 1000000)));;the random values being added/removed/whatever from the ds  
  (println init-ds)  
  (println iter)  
  (test-suite.  
   ;;the list test  
   (test-struct iter func (nth init-ds 0) delta-list)  
   ;;the vector test  
   (test-struct iter func (nth init-ds 1) delta-list)  
   ;;the set test  
   (test-struct iter func (nth init-ds 2) delta-list)  
   )  
)  

;;;Calls run-test a number of times storing the results as a list in a test-series data structure along with the list of functions tested.
(defn run-test-set  
  ([iter func-list ds-list] (run-test-set iter (test-series. nil func-list) func-list ds-list))
  ([iter ;;the number of times each test is run before being averaged  
   series ;;data-structure that aggregates the list of test results, and ultimately is returned
   func-list ;;the list of functions to be tested  
   ds-list] ;;the list of initial datastructures to be tested  
  (if (> (count func-list) 0)  
    (run-test-set ;;recursively run this, aggregating test-suites as we go
     iter
     (test-series. ;;create a new test series with all the functions and suites run so far
      (conj (:t-suites series) ;;run a test suite and append it to those run so far  
            (run-test iter (first func-list) (first ds-list)))  
      (:t-functions series))  
     (rest func-list)  
     (rest ds-list)  
     )  
    series)) ;;finished with last run return results  
)  

Tests
All times are in ms.

;;;;;;;;;;;;;;;;;;EVALUATING 'run-test' directly
;;;get average speeds for adding 1000000 random elements to list, vector and set
;;;run the test 20 times and average the results
(run-test 20 conj '(() [] #{}))
;;;;;RESULT
#test.test-suite{:t-list 254/5, :t-vector 2249/20, :t-set 28641/20}  

Or roughly 51, 112, and 1432 for the list, vector, and set respectively.

;;;;;;;;;;;;;;;;;;EVALUATING using 'run-test-set' which calls run-test
(run-test-set   
 20              ;;;times the test is run 
 '(conj)         ;;;just run conj (adding to the ds for now)
 '((() [] #{}))   ;;;add random values to blank structures
 )  
;;;;RESULT
#test.test-series{
  :t-suites (
    #test.test-suite{
      :t-list 1297/10,
      :t-vector 1297/10,
      :t-set 1289/10}) ;;;;;;;;;;;;Result of adding values
  :t-functions (conj)}
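Or roughly 130 for each of the list, vector, and set, which is about the same as the vector result above.

Does anyone know why it returns such different results depending on how it is run? Is this related to Clojure, or could it be an optimization that Java is doing?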

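1 Answer:

Answer 0 (score: 2)

The right way to test the performance of Clojure code is criterium. Among other things, criterium reports statistics about the distribution of your code's execution time, and it makes sure the JVM HotSpot compiler is warmed up before measurements are taken. The JVM HotSpot compiler is likely the reason you are seeing these performance differences.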
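Separately from JIT warm-up, one detail in the second call from the question may be worth checking: `'(conj)` is a quoted list, so its element is the symbol `conj` rather than the function. A symbol is callable and acts as a lookup with a default, so `reduce` would only walk the input without building anything, which would be consistent with all three timings being nearly identical. A minimal sketch, where the names `quoted-fns`, `real-fns`, `via-symbol`, and `via-fn` are illustrative and not from the original code:

```clojure
;; '(conj) holds the symbol conj; (list conj) holds the function itself.
(def quoted-fns '(conj))
(def real-fns (list conj))

(symbol? (first quoted-fns)) ; the quoted element is a symbol
(fn? (first real-fns))       ; the unquoted element is a function

;; ('conj coll x) behaves like (get coll 'conj x): the lookup fails
;; and x is returned as the default, so nothing is accumulated.
(def via-symbol (reduce (first quoted-fns) [] [1 2 3]))

;; With the real function the vector is actually built.
(def via-fn (reduce (first real-fns) [] [1 2 3]))
```

If this is the cause, passing `(list conj)` or `[conj]` instead of `'(conj)` should make `run-test-set` exercise the same work as the direct `run-test` call.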

Don't use def inside defn; def is intended for top-level global definitions. Use let for bindings that are only used within one function.
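A minimal sketch of the question's `time-test` written with `let` rather than nested `def`s. As an assumption of this sketch, `System/nanoTime` is swapped in for `java.util.Date`, and the result is converted to milliseconds:

```clojure
(defn time-test
  "Runs (reduce func init-ds delta-list) once and returns the elapsed
  time in milliseconds. Uses System/nanoTime instead of java.util.Date
  (a substitution made for this sketch)."
  [func init-ds delta-list]
  (let [start (System/nanoTime)
        _     (reduce func init-ds delta-list) ; result is discarded
        end   (System/nanoTime)]
    ;; start and end are local to this call; no global vars are created
    (/ (- end start) 1e6)))
```

Because the bindings are local, repeated or concurrent calls cannot overwrite each other's start and end times, which nested `def`s would allow.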

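Defining records that are used only once and exist only to hold a few values is not idiomatic in Clojure. The overhead of defining the class is greater than any benefit it might give you (in the difficulty of understanding the code, if not in the performance of your code). Save records for when you need to specialize a protocol or improve performance in a tight loop.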
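As a sketch of that advice, a plain map can stand in for the `test-suite` record. The helper name `suite-result` is hypothetical, and the values are the averages reported in the question:

```clojure
;; Hypothetical helper: a plain map replaces the test-suite record.
(defn suite-result
  [list-ms vector-ms set-ms]
  {:t-list list-ms, :t-vector vector-ms, :t-set set-ms})

(def result (suite-result 254/5 2249/20 28641/20))

;; Keyword lookup works the same way as it does on a record.
(:t-vector result)
```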

If your priority is human readability of the numbers rather than accuracy, you can use double to force printing in a more readable format.
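For instance, the ratios from the question's first result can be converted like this (the names `averages` and `readable` are illustrative):

```clojure
;; Averages from the question, as exact ratios.
(def averages {:t-list 254/5, :t-vector 2249/20, :t-set 28641/20})

;; Convert every value to a double for easier reading.
(def readable
  (into {} (map (fn [[k v]] [k (double v)]) averages)))
;; readable is {:t-list 50.8, :t-vector 112.45, :t-set 1432.05}
```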

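Here is how to test the properties you are interested in idiomatically (a transcript from a REPL session, though this could also be run from a -main function):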
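A minimal sketch of such a session, assuming the criterium library is on the classpath. The alias `crit` and the input size are illustrative choices, and the reports criterium prints are omitted:

```clojure
;; Assumption: criterium has been added as a project dependency.
(require '[criterium.core :as crit])

;; Random values to add to each structure (size chosen for illustration).
(def delta-list (shuffle (range 100000)))

;; quick-bench warms up the JIT, runs the expression repeatedly,
;; and prints summary statistics for the execution time.
(crit/quick-bench (reduce conj '() delta-list))
(crit/quick-bench (reduce conj [] delta-list))
(crit/quick-bench (reduce conj #{} delta-list))
```

`crit/bench` takes the same form and runs a longer, more thorough measurement when tighter estimates are needed.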
