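I'm new to Clojure and to functional programming in general. Curious about the speed of some basic operations on data structures (Clojure's defaults and some I might implement myself), I wrote something to automate testing operations such as adding to a structure.
My method that runs the tests on 3 data structures consistently returns very different average run times depending on how it is called, even though its input stays the same.
The code is below, with the tests and results at the bottom.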
(import '(java.util Date))

(defrecord test-suite ;; holds the test results for 3 data structures
  [t-list
   t-vector
   t-set])

(defrecord test-series ;; holds list of test results (list of test-suite) and the list of functions used in the respective tests
  [t-suites
   t-functions])

;;; Runs the test, returns the time it took
(defn time-test [func init-ds delta-list]
  (def startTime (. (new Date) (getTime)))
  (reduce func init-ds delta-list)
  (def endTime (. (new Date) (getTime)))
  (- endTime startTime))

;;; Runs the test x number of times, returning the average run time
(defn test-struct
  ([iter func init-ds delta-list]
   (test-struct iter func init-ds delta-list ()))
  ([iter       ;; number of times to run tests
    func       ;; function being tested (add, remove, etc.)
    init-ds    ;; initial data structure being tested
    delta-list
    addRes]    ;; test results
   (println (first addRes)) ;; print previous run time for debugging
   ;; test if done recursing
   (if (> iter 0)
     (test-struct
      (- iter 1)
      func
      init-ds
      delta-list
      (conj addRes (time-test func init-ds delta-list)))
     (/ (reduce + addRes) (count addRes)))))

;;; Tests a function on a passed-in data structure and a randomly generated list of numbers
(defn run-test
  [iter     ;; the number of times each test will be run
   func     ;; the function being tested
   init-ds] ;; the initial data structures being tested
  (def delta-list (shuffle (range 1000000))) ;; the random values being added/removed/whatever from the ds
  (println init-ds)
  (println iter)
  (test-suite.
   ;; the list test
   (test-struct iter func (nth init-ds 0) delta-list)
   ;; the vector test
   (test-struct iter func (nth init-ds 1) delta-list)
   ;; the set test
   (test-struct iter func (nth init-ds 2) delta-list)))

;;; Calls run-test a number of times, storing the results as a list in a test-series data structure along with the list of functions tested.
(defn run-test-set
  ([iter func-list ds-list]
   (run-test-set iter (test-series. nil func-list) func-list ds-list))
  ([iter      ;; the number of times each test is run before being averaged
    series    ;; data structure that aggregates the list of test results, and ultimately is returned
    func-list ;; the list of functions to be tested
    ds-list]  ;; the list of initial data structures to be tested
   (if (> (count func-list) 0)
     (run-test-set ;; recursively run this, aggregating test-suites as we go
      iter
      (test-series. ;; create a new test series with all the functions and suites run so far
       (conj (:t-suites series) ;; run a test suite and append it to those run so far
             (run-test iter (first func-list) (first ds-list)))
       (:t-functions series))
      (rest func-list)
      (rest ds-list))
     series))) ;; finished with last run, return results
Tests
All times are in ms.
;;;;;;;;;;;;;;;;;;EVALUATING 'run-test' directly
;;;get average speeds for adding 100000 random elements to list vector and set
;;;run the test 20 times and average the results
(run-test 20 conj '(() [] #{}))
;;;;;RESULT
#test.test-suite{:t-list 254/5, :t-vector 2249/20, :t-set 28641/20}
That is roughly 51, 112, and 1432 for the list, vector, and set respectively.
;;;;;;;;;;;;;;;;;;EVALUATING using 'run-test-set' which calls run-test
(run-test-set
20 ;;;times the test is run
'(conj) ;;;just run conj (adding to the ds for now)
'((() [] #{})) ;;;add random values to blank structures
)
;;;;RESULT
#test.test-series{
:t-suites (
#test.test-suite{
:t-list 1297/10,
:t-vector 1297/10,
:t-set 1289/10}) ;;;;;;;;;;;;Result of adding values
:t-functions (conj)}
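That is roughly 130 for each of the list, vector, and set, which is about the same as the vector result above.
Does anyone know why it returns such different results depending on how it is run? Is this related to Clojure, or could it be an optimization that Java is doing?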
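Answer 0 (score: 2)
The right way to test the performance of Clojure code is criterium. Among other things, criterium reports statistics about the distribution of your code's execution time, and it makes sure the JVM HotSpot compiler is warmed up before measurements are taken. The JVM HotSpot compiler is likely the reason you are seeing these performance differences.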
Don't use def inside defn; def is meant for top-level global definitions only. Use let for bindings that are only used inside one function.
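As a rough illustration of that point, the timing helper from the question could be written with let instead of def. This is only a sketch: the name time-test-ms is hypothetical, and it swaps java.util.Date for System/nanoTime, which is not what the question used.

```clojure
;; Sketch only: `time-test-ms` is a hypothetical name, not from the original post.
;; System/nanoTime replaces the question's java.util.Date calls.
(defn time-test-ms
  "Reduces delta-list into init-ds with func and returns the elapsed
  wall-clock time in milliseconds."
  [func init-ds delta-list]
  (let [start (System/nanoTime)
        _     (reduce func init-ds delta-list) ; reduce is eager; result is discarded
        end   (System/nanoTime)]
    ;; start and end are local bindings, invisible outside this function
    (/ (- end start) 1e6)))

(time-test-ms conj [] (range 1000))
```

Because start and end are local, repeated or concurrent calls cannot overwrite each other's values the way global vars created by def can.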
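Defining records that are used only once and exist only to hold a few values is not idiomatic in Clojure. The overhead of defining the class outweighs any benefit it is likely to give you (in how hard the code is to understand, if not in its performance). Save records for when you need to specialize a protocol or improve performance in a tight loop.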
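For instance, the averaged timings could be returned as a plain map rather than a test-suite record. The helper name summarize and the input numbers below are made up purely for illustration; only the keys mirror the record fields in the question.

```clojure
;; Hypothetical helper: `summarize` is not from the original post.
(defn summarize
  "Returns a plain map of average timings, keyed like the question's record fields."
  [list-times vector-times set-times]
  (let [avg (fn [xs] (/ (reduce + xs) (count xs)))]
    {:t-list   (avg list-times)
     :t-vector (avg vector-times)
     :t-set    (avg set-times)}))

;; Arbitrary illustrative inputs, not measured values.
(summarize [1 3] [2 4] [10 20])
;; => {:t-list 2, :t-vector 3, :t-set 15}
```

Keyword lookup such as (:t-list result) works the same way on the map as it did on the record.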
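If your priority is human readability of the numbers rather than accuracy, you can use double to coerce them into a more readable format for printing.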
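A small sketch of that coercion, applied to the ratios reported in the question. The helper name readable is hypothetical and assumes the results are held in a map.

```clojure
;; Clojure's / on integers yields exact ratios such as 254/5;
;; double trades that exactness for a decimal that is easier to read.
(double 254/5)
;; => 50.8

;; Hypothetical helper (not from the original post): coerce every value in a result map.
(defn readable [results]
  (into {} (map (fn [[k v]] [k (double v)]) results)))

(readable {:t-list 254/5, :t-vector 2249/20, :t-set 28641/20})
;; => {:t-list 50.8, :t-vector 112.45, :t-set 1432.05}
```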
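Here is how to test the properties you are interested in idiomatically (a transcript from a REPL session, though this could also be run from a -main function):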
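The transcript itself is not reproduced in this copy. As a stand-in, here is a minimal sketch of what such a session might look like, assuming the criterium library is on the classpath and loaded as criterium.core. The var name input is illustrative, and no output is shown because the timings depend on the machine and JVM.

```clojure
;; Assumption: criterium has been added as a project dependency.
(require '[criterium.core :as crit])

;; Same kind of input as the question: one million shuffled integers.
(def input (shuffle (range 1000000)))

;; quick-bench warms up the JVM, evaluates the expression repeatedly,
;; and prints timing statistics; crit/bench does the same with a longer run.
(crit/quick-bench (reduce conj () input))   ; list
(crit/quick-bench (reduce conj [] input))   ; vector
(crit/quick-bench (reduce conj #{} input))  ; set
```

Each structure is benchmarked by the same expression, so the three reports can be compared directly.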