Incanter sample mean and variance are not close to the distribution mean and variance

Time: 2014-04-19 06:47:05

Tags: clojure incanter

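I answered a question about using the gamma distribution in NumPy to generate samples with positive support and a known mean and variance. I thought I would try the same approach in Incanter. But unlike the results I got with NumPy, I could not get a sample mean and variance close to the distribution's mean and variance.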

(defproject incanter-repl "0.1.0-SNAPSHOT"
  :description "FIXME: write description"
  :url "http://example.com/FIXME"
  :license {:name "Eclipse Public License"
            :url "http://www.eclipse.org/legal/epl-v10.html"}
  :dependencies [[org.clojure/clojure "1.6.0"]
                 [incanter "1.5.4"]])

(require '[incanter 
           [core] 
           [distributions :refer [gamma-distribution mean variance draw]] 
           [stats :as stats]])

(def dist 
  (let [mean 0.71 
        variance 2.89 
        theta (/ variance mean) 
        k (/ mean theta) ] 
    (gamma-distribution k theta)))

Incanter computes the mean and variance of the distribution:

(mean dist) ;=> 0.71
(variance dist) ;=> 2.89

I compute the sample mean and variance from draws from that distribution:

(def samples (repeatedly 10000 #(draw dist)))

(stats/mean samples) ;=> 0.04595208774029654
(stats/variance samples) ;=> 0.01223348345651905

I expected these statistics computed on the sample to be closer to the distribution's mean and variance. What am I missing?

Answer

Incanter inherits a bug from Parallel Colt: parameter handling is inconsistent across various methods in Parallel Colt. See the issue report at https://github.com/incanter/incanter/issues/245

1 answer:

Answer 0 (score: 4)

Unlike numpy.random.gamma, which takes shape (k) and scale (theta) as parameters, Incanter's gamma-distribution takes shape (k) and rate (1 / theta) as parameters. See (doc gamma-distribution) and http://en.wikipedia.org/wiki/Gamma_distribution
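To make the parameter difference concrete, here is a short derivation of the shape and rate from a target mean and variance. It uses the standard gamma moments for the shape/rate parametrization; the symbols k and r are the shape and rate, and theta = 1 / r is the scale.

```latex
\begin{aligned}
\mathbb{E}[X] &= \frac{k}{r}, &
\operatorname{Var}[X] &= \frac{k}{r^{2}} \\[4pt]
\Rightarrow\quad r &= \frac{\mathbb{E}[X]}{\operatorname{Var}[X]}, &
k &= \mathbb{E}[X]\, r = \frac{\mathbb{E}[X]^{2}}{\operatorname{Var}[X]}
\end{aligned}
```

For a mean of 0.71 and a variance of 2.89 this gives r ≈ 0.2457 and k ≈ 0.1744. If the scale theta = 2.89 / 0.71 ≈ 4.07 is passed where a rate is expected, the implied mean becomes k / 4.07 ≈ 0.043, which is roughly in line with the sample mean of about 0.046 reported in the question.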

So, to get the desired result, you can define dist as

(def dist 
  (let [mean 0.71 
        variance 2.89 
        r (/ mean variance) 
        k (* mean r) ] 
    (gamma-distribution k r)))

The sample results are then:
incanter-test.core=> (def samples (repeatedly 10000 #(draw dist)))
#'incanter-test.core/samples
incanter-test.core=> (stats/mean samples)
0.7163908381930312
incanter-test.core=> (stats/variance samples)
2.940867216122528
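
As a minimal sketch of the conversion used above, the helper below turns a target mean and variance into a gamma shape and rate. The name gamma-shape-rate is hypothetical and not part of Incanter; it only does the arithmetic, assuming the shape/rate parametrization described in this answer, and leaves the call to gamma-distribution to the caller.

```clojure
;; Hypothetical helper (not part of Incanter): method-of-moments conversion
;; from a target mean and variance to gamma shape (k) and rate (r = 1/theta).
(defn gamma-shape-rate
  [mean variance]
  (let [r (/ mean variance)   ; rate:  mean / variance
        k (* mean r)]         ; shape: mean^2 / variance
    {:shape k :rate r}))

;; Round-trip check: under shape/rate, mean = k/r and variance = k/r^2,
;; so this map should reproduce the inputs up to floating-point error.
(let [{:keys [shape rate]} (gamma-shape-rate 0.71 2.89)]
  {:mean     (/ shape rate)
   :variance (/ shape (* rate rate))})
```

Under that assumption, the two values could be passed on as (gamma-distribution shape rate), which is what the let form in this answer does inline.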