Python: calculating the mean and sd of a normal distribution within an interval

Time: 2017-11-03 08:00:52

Tags: python normal-distribution

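My question is:

I have an interval, or several intervals, let's say:

[0; 0.3] [0.3; 0.8] [0.8; 1]

In each interval I have a normal distribution, built with truncnorm() and sampled with .rvs().

So I have several "normal distributions" along the x-axis.

However, the truncnorm method expects the mean and sd of the distribution within the interval. How can I calculate the mean and sd for a specific interval in Python?

numpy.mean(), for example, does not seem to help. I am also getting strange results, so I think my mean/sd calculation before calling truncnorm is wrong.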
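For reference, the mean and sd of a normal distribution restricted to an interval can be read off `scipy.stats.truncnorm` itself. The sketch below is a minimal illustration, not the asker's code: the helper name `truncated_moments` is made up here, and centring the parent normal on the interval with the interval spanning six standard deviations is an assumption borrowed from the `/2` and `/6` expressions in the SQL further down. Note that `truncnorm` takes its clip points `a` and `b` in standard-deviation units relative to `loc` and `scale`.

```python
import numpy as np
from scipy.stats import truncnorm


def truncated_moments(low, upp, loc, scale):
    """Mean and sd of a normal(loc, scale) restricted to [low, upp]."""
    # truncnorm expects its bounds in sd units around loc, not raw x values.
    a, b = (low - loc) / scale, (upp - loc) / scale
    mean, var = truncnorm.stats(a, b, loc=loc, scale=scale, moments="mv")
    return float(mean), float(np.sqrt(var))


intervals = [(0.0, 0.3), (0.3, 0.8), (0.8, 1.0)]
for low, upp in intervals:
    loc = (low + upp) / 2    # assumption: parent normal centred on the interval
    scale = (upp - low) / 6  # assumption: interval covers +/- 3 sd
    mean, sd = truncated_moments(low, upp, loc, scale)
    print(low, upp, mean, sd)
```

With a symmetric truncation like this, the mean stays at the midpoint and the sd comes out slightly smaller than that of the untruncated normal.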

Thanks, everyone.

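*Edit: for other columns, where the intervals are not that small, it works fine. Is there a limit on the number of intervals? The error occurs, for example, with the interval [0.12; 0.17] -> value 0.0937818650369 (out of range).*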

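Yes, sure. What I want to do is this: I have an interval and want a value that lies between its bounds, drawn in a simple way from a truncated normal distribution. I have an additional column into which the sampled value should be written, based on the interval derived from another column. For example: interval [0.2; 0.6] -> sampled value 0.343433. I think I have found a solution: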

truncnorm().stats()

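However, for reasons I don't understand, with the parameters I pass to the truncnorm() function almost 50% of the values I get are outside the bounds. What am I doing wrong?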
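One thing worth checking is how the bounds are handed to `truncnorm`: its `a` and `b` arguments are clip points in sd units around `loc`, so the support of the frozen distribution is `[loc + a*scale, loc + b*scale]`. The sketch below is only an assumption about what `get_truncated_normal` might look like (the question does not show it), together with a hypothetical `fraction_outside` helper for measuring how many samples leave the interval. It is not a diagnosis of the out-of-range values reported here.

```python
import numpy as np
from scipy.stats import truncnorm


def get_truncated_normal(mean, sd, low, upp):
    # Hypothetical reconstruction: the helper is not shown in the question.
    # The raw bounds are converted to sd units around the mean.
    return truncnorm((low - mean) / sd, (upp - mean) / sd, loc=mean, scale=sd)


def fraction_outside(samples, low, upp):
    """Share of samples that fall outside [low, upp]."""
    samples = np.asarray(samples)
    return float(np.mean((samples < low) | (samples > upp)))


low, upp = 0.12, 0.17
mean, sd = (low + upp) / 2, (upp - low) / 6  # assumed midpoint and width/6

good = get_truncated_normal(mean, sd, low, upp).rvs(size=10_000, random_state=0)
print(fraction_outside(good, low, upp))

# Passing the raw bounds as a and b instead gives the support
# [mean + low*sd, mean + upp*sd], which is not the intended interval.
bad = truncnorm(low, upp, loc=mean, scale=sd).rvs(size=10_000, random_state=0)
print(bad.min(), bad.max())
```

Running a check like `fraction_outside` directly on the sampled values, before they are written to the table, helps separate a sampling problem from a problem in how rows and intervals are paired.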

Here is the code (a small part of it):

    convert_cat=(name_convert_column,name_convert_column,_tabelle,name_convert_column,_tabelle,_tabelle,name_convert_column)
    drop_view=(name_convert_column)
    calculate=(name_convert_column,name_convert_column,name_convert_column,name_convert_column,name_convert_column,_tabelle,name_convert_column,name_convert_column)
    cur.execute("CREATE VIEW convert_cat_%s (quotient, %s, rnum) AS SELECT (COUNT(*)/(SELECT COUNT(*) FROM %s ) ) as quotient, %s, row_number() over ( order by (COUNT(*)/(SELECT COUNT(*) FROM %s ) ) desc ) as rnum FROM     %s  GROUP BY %s ORDER BY quotient desc" %convert_cat)
    cur.execute("Select b.ID,a.unten,a.oben, a.mean, a.sd FROM( SELECT t3.RNUM, t3.%s, lag(t3.com_Pr,1,0) OVER (order by rnum asc) as unten , t3.com_PR as oben, ((t3.com_PR +(lag(t3.com_Pr,1,0) OVER (order by rnum asc)))/2) as MEAN, ((t3.com_PR-(lag(t3.com_Pr,1,0) OVER (order by rnum asc)))/6) AS SD FROM( SELECT t1.rnum, t1.%s , SUM(t2.quotient) as com_Pr FROM CONVERT_CAT_%s t1 INNER JOIN CONVERT_CAT_%s t2 ON t1.rnum >= t2.rnum group by t1.rnum, t1.%s, t1.quotient ORDER BY RNUM asc ) t3) a INNER JOIN %s b ON b.%s = a.%s order by ID asc" %calculate)
    _content_category = cur.fetchall()
    add_category_number_column = (_tabelle, name_convert_column)
    cur.execute("ALTER TABLE %s ADD %s_category NUMBER(15,14)" % add_category_number_column)
    x=0
    for ID in _content_category:
        id = _content_category[0]
        id_category = [j[0] for j in _content_category]
        unten_category = [j[1] for j in _content_category]
        oben_category = [j[2] for j in _content_category]
        #mean_category = [j[3] for j in _content_category]
        sd_category = [j[4] for j in _content_category]
        mean, var = truncnorm.stats(unten_category[x], oben_category[x], moments='mv')
        # sd = np.sqrt(var)
        X = get_truncated_normal(mean= mean, sd=sd_category[x], low=unten_category[x], upp=oben_category[x])
        update_cells_value = float(X.rvs(1))
        category = (_tabelle, name_convert_column,update_cells_value,id_category[x])
        cur.execute("UPDATE %s SET %s_category = %s WHERE ID=%s" % category)

        x += 1

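I tried calculating the mean and sd in the SQL query using

1) ((t3.com_PR +(lag(t3.com_Pr,1,0) OVER (order by rnum asc)))/2) as MEAN
2) ((t3.com_PR-(lag(t3.com_Pr,1,0) OVER (order by rnum asc)))/6) AS SD

and also with the truncnorm().stats() function. With the stats function the results seem to get worse, and the values end up even further out of range than before...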
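The interval construction in the SQL can be mirrored in plain Python to make it easier to inspect. The following is a sketch under stated assumptions: the function name `category_intervals` and the sample data are invented for illustration, and it follows the query's apparent logic (categories ranked by relative frequency, cumulative shares as upper bounds, the previous cumulative share as lower bound, mean as the midpoint, sd as one sixth of the width).

```python
from collections import Counter
from itertools import accumulate


def category_intervals(values):
    """Map each category to (low, upp, mean, sd) on [0, 1], most frequent first."""
    counts = Counter(values).most_common()  # ranked by frequency, descending
    total = sum(n for _, n in counts)
    # Cumulative shares play the role of com_Pr in the SQL.
    uppers = list(accumulate(n / total for _, n in counts))
    # The previous cumulative share, starting at 0, mirrors lag(com_Pr, 1, 0).
    lowers = [0.0] + uppers[:-1]
    return {
        cat: (lo, up, (lo + up) / 2, (up - lo) / 6)
        for (cat, _), lo, up in zip(counts, lowers, uppers)
    }


column = ["a"] * 3 + ["b"] * 5 + ["c"] * 2  # made-up category column
for cat, (lo, up, mean, sd) in category_intervals(column).items():
    print(cat, lo, up, mean, sd)
```

Printing the intervals this way makes it easy to confirm that each mean lies inside its own interval before any sampling takes place.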

1 answer:

Answer 0: (score: 0)

Although I can't run your example, there may be one problem:

    for ID in _content_category:
        id = _content_category[0]
        ...

It would be better as:

    for ID in _content_category:
        id = _content_category[ID]
        ...
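Since iterating over the result of `fetchall()` yields the row tuples themselves rather than integer indices, the loop can also unpack each row directly and drop the manual counter. The sketch below is one way to do that, not a tested fix for the original script: `sample_per_row` is a hypothetical name, the rows are assumed to have the shape `(ID, unten, oben, mean, sd)` as selected in the question's query, and the database update is left out.

```python
import numpy as np
from scipy.stats import truncnorm


def sample_per_row(rows, random_state=None):
    """Draw one truncated-normal value per (id, low, upp, mean, sd) row."""
    results = []
    for row_id, low, upp, mean, sd in rows:  # unpack the row, no index needed
        # Database drivers may return Decimal values, so convert to float first.
        low, upp, mean, sd = map(float, (low, upp, mean, sd))
        a, b = (low - mean) / sd, (upp - mean) / sd  # bounds in sd units
        value = truncnorm.rvs(a, b, loc=mean, scale=sd, random_state=random_state)
        results.append((row_id, float(value)))
    return results


# Made-up rows in the assumed (ID, unten, oben, mean, sd) layout.
rows = [
    (1, 0.0, 0.5, 0.25, 0.5 / 6),
    (2, 0.5, 0.8, 0.65, 0.05),
    (3, 0.8, 1.0, 0.9, 0.2 / 6),
]
for row_id, value in sample_per_row(rows, np.random.RandomState(0)):
    print(row_id, value)
```

The returned `(id, value)` pairs could then be written back in a single pass, preferably with the driver's bind parameters for the value and ID rather than string formatting.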