My problem is:
I have an interval, or rather multiple intervals, let's say:
[0;0.3]
[0.3;0.8]
[0.8;1]
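On each interval I have a normal distribution, sampled with truncnorm() and .rvs().
So I have multiple "normal distributions" along the x-axis.
However, truncnorm expects the mean and sd of the distribution within the interval. How can I calculate the mean and sd for a specific interval in Python? numpy.mean(), for example, does not seem to work. I also get strange results, so I think my mean / sd calculation before calling truncnorm is wrong.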
Thanks, everyone.
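*Edit: For other columns, where the intervals are not that small, it works fine. Is the number of intervals limited? The error occurs, for example, with the interval [0.12; 0.17] -> value 0.0937818650369 (out of range).*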
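For reference, a minimal sketch of sampling inside the small interval [0.12; 0.17]. scipy's truncnorm takes its bounds a and b in units of standard deviations measured from loc, not as raw interval limits, so they are converted first. The helper name get_truncated_normal matches the call in the question's code, but its body is not shown there, so the implementation here is an assumption, as is the choice of mean (interval centre) and sd (one sixth of the width).

```python
from scipy.stats import truncnorm


def get_truncated_normal(mean=0.0, sd=1.0, low=0.0, upp=1.0):
    """Return a frozen normal(mean, sd) truncated to [low, upp] (assumed helper)."""
    # truncnorm expects standardized bounds: (limit - loc) / scale
    a = (low - mean) / sd
    b = (upp - mean) / sd
    return truncnorm(a, b, loc=mean, scale=sd)


low, upp = 0.12, 0.17      # the small interval from the edit
mean = (low + upp) / 2     # assumption: centre of the interval
sd = (upp - low) / 6       # assumption: the interval spans +/- 3 sd

dist = get_truncated_normal(mean=mean, sd=sd, low=low, upp=upp)
samples = dist.rvs(size=1000, random_state=0)

print(samples.min(), samples.max())
```

With the bounds standardized this way, every sample lies in [low, upp] regardless of how narrow the interval is.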
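Yes, sure. What I want to do is: I have an interval and want a value that lies between the bounds of that interval, drawn in a simple way from a truncated normal distribution. I have an additional column into which the sampled value should be written. For example: interval [0.2; 0.6] -> sampled value 0.343433. I think I found a solution: truncnorm().stats(). But I don't know why: with the parameters I pass to the truncnorm() function, almost 50% of the values I get are outside the bounds. What am I doing wrong?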
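As a hedged illustration of what truncnorm.stats() reports: called with raw interval limits and the default loc=0, scale=1, it describes a standard normal cut to that interval, whereas with standardized bounds plus loc and scale it describes a normal(mean, sd) cut to the same interval. The values of mean and sd below are assumptions (interval centre and one sixth of the width), not something the question confirms.

```python
import numpy as np
from scipy.stats import truncnorm

low, upp = 0.2, 0.6
mean, sd = (low + upp) / 2, (upp - low) / 6   # assumed parameters of the untruncated normal

# Raw bounds, default loc=0 and scale=1: a standard normal cut to [0.2, 0.6].
m_raw, v_raw = truncnorm.stats(low, upp, moments="mv")

# Standardized bounds plus loc/scale: normal(mean, sd) cut to [0.2, 0.6].
a, b = (low - mean) / sd, (upp - mean) / sd
m_std, v_std = truncnorm.stats(a, b, loc=mean, scale=sd, moments="mv")

print("raw bounds:         ", float(m_raw), float(np.sqrt(v_raw)))
print("standardized bounds:", float(m_std), float(np.sqrt(v_std)))
```

The two calls describe different distributions over the same interval, so the mean from one should not be combined with the sd of the other.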
Here is the code (a small part of it):
convert_cat=(name_convert_column,name_convert_column,_tabelle,name_convert_column,_tabelle,_tabelle,name_convert_column)
drop_view=(name_convert_column)
calculate=(name_convert_column,name_convert_column,name_convert_column,name_convert_column,name_convert_column,_tabelle,name_convert_column,name_convert_column)
cur.execute("CREATE VIEW convert_cat_%s (quotient, %s, rnum) AS SELECT (COUNT(*)/(SELECT COUNT(*) FROM %s ) ) as quotient, %s, row_number() over ( order by (COUNT(*)/(SELECT COUNT(*) FROM %s ) ) desc ) as rnum FROM %s GROUP BY %s ORDER BY quotient desc" %convert_cat)
cur.execute("Select b.ID,a.unten,a.oben, a.mean, a.sd FROM( SELECT t3.RNUM, t3.%s, lag(t3.com_Pr,1,0) OVER (order by rnum asc) as unten , t3.com_PR as oben, ((t3.com_PR +(lag(t3.com_Pr,1,0) OVER (order by rnum asc)))/2) as MEAN, ((t3.com_PR-(lag(t3.com_Pr,1,0) OVER (order by rnum asc)))/6) AS SD FROM( SELECT t1.rnum, t1.%s , SUM(t2.quotient) as com_Pr FROM CONVERT_CAT_%s t1 INNER JOIN CONVERT_CAT_%s t2 ON t1.rnum >= t2.rnum group by t1.rnum, t1.%s, t1.quotient ORDER BY RNUM asc ) t3) a INNER JOIN %s b ON b.%s = a.%s order by ID asc" %calculate)
_content_category = cur.fetchall()
add_category_number_column = (_tabelle, name_convert_column)
cur.execute("ALTER TABLE %s ADD %s_category NUMBER(15,14)" % add_category_number_column)
x=0
for ID in _content_category:
    id = _content_category[0]
    id_category = [j[0] for j in _content_category]
    unten_category = [j[1] for j in _content_category]
    oben_category = [j[2] for j in _content_category]
    #mean_category = [j[3] for j in _content_category]
    sd_category = [j[4] for j in _content_category]
    mean, var = truncnorm.stats(unten_category[x], oben_category[x], moments='mv')
    # sd = np.sqrt(var)
    X = get_truncated_normal(mean= mean, sd=sd_category[x], low=unten_category[x], upp=oben_category[x])
    update_cells_value = float(X.rvs(1))
    category = (_tabelle, name_convert_column,update_cells_value,id_category[x])
    cur.execute("UPDATE %s SET %s_category = %s WHERE ID=%s" % category)
    x += 1
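I tried calculating the mean and sd in the SQL query with
1) ((t3.com_PR +(lag(t3.com_Pr,1,0) OVER (order by rnum asc)))/2) as MEAN
2) ((t3.com_PR-(lag(t3.com_Pr,1,0) OVER (order by rnum asc)))/6) AS SD
and with the truncnorm().stats() function. With the stats function the results seem to get worse, and the values from before are even out of range...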
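To make the two SQL expressions easier to check, here is a small sketch of the same arithmetic in NumPy. The relative frequencies are made-up illustration values. The running total plays the role of com_PR ("oben"), and the previous running total plays the role of the lag(...) term ("unten").

```python
import numpy as np

# Hypothetical relative frequencies of three categories, sorted descending
quotients = np.array([0.5, 0.3, 0.2])

upper = np.cumsum(quotients)                   # "oben": running total
lower = np.concatenate(([0.0], upper[:-1]))    # "unten": previous running total, 0 for the first row

mean = (upper + lower) / 2   # same arithmetic as the MEAN expression
sd = (upper - lower) / 6     # same arithmetic as the SD expression

for lo_, up_, m_, s_ in zip(lower, upper, mean, sd):
    print(f"[{lo_:.2f}; {up_:.2f}]  mean={m_:.4f}  sd={s_:.4f}")
```

With these definitions each interval spans exactly three standard deviations on either side of its mean.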
Answer 0: (score: 0)
Although I can't run your example, there may be a problem here:
for ID in _content_category:
    id = _content_category[0]
    ...
Better would be:
for ID in _content_category:
    id = ID[0]
    ...
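A rough sketch of how the whole loop could be written by unpacking each row directly, so that no separate counter or per-iteration list is needed. The rows are hypothetical stand-ins for the result of cur.fetchall(), assumed to have the shape (ID, unten, oben, mean, sd); the database update is left out because it cannot be run here.

```python
from scipy.stats import truncnorm

# Hypothetical rows shaped like the query result: (ID, unten, oben, mean, sd)
rows = [
    (1, 0.0, 0.3, 0.15, 0.3 / 6),
    (2, 0.3, 0.8, 0.55, 0.5 / 6),
    (3, 0.8, 1.0, 0.90, 0.2 / 6),
]

updates = []
for row_id, low, upp, mean, sd in rows:
    # Convert the interval limits to standard-deviation units around the mean
    a, b = (low - mean) / sd, (upp - mean) / sd
    # Seeding with the row ID is only for reproducibility of this sketch
    value = float(truncnorm.rvs(a, b, loc=mean, scale=sd, random_state=row_id))
    updates.append((value, row_id))

print(updates)
```

The collected (value, ID) pairs could then be written back with the cursor, ideally through the driver's bind parameters rather than string formatting.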