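I'm trying to compute the standard deviation via np.std(array, ddof=0). The problem shows up if I happen to have a long "delta" array, i.e. one where all values are the same. Instead of returning std = 0, it gives some small value, which in turn causes further estimation errors. The mean is returned correctly... For example: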
np.std([0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1],ddof = 0)
gives 1.80411241502e-16
but
np.std([0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1],ddof = 0)
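gives std = 0
Is there a way to overcome this, other than checking the data for uniqueness on every iteration and not computing std at all?
Thanks
P.S. After this was marked as a duplicate of "Is floating point math broken?", I'm copy-pasting @kxr's reply on why this is a different question: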
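“The current duplicate marking is wrong. It is not just about simple float comparison, but about the internal accumulation of small errors into a near-zero result when np.std is used on long arrays, as the asker additionally pointed out. Compare e.g. { {1}}. So he can solve it by: >>> np.std([0.1, 0.1, 0.1, 0.1, 0.1, 0.1]*200000) -> 2.0808632594793153e-12
”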
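The problem certainly starts with floating-point representation, but it doesn't end there. @kxr - I appreciate the comment and the example.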
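To make the accumulation point concrete, here is a small sketch (not taken from @kxr's comment; the variable names are just for illustration). It assumes that shifting the data by one of its own elements before calling np.std is acceptable: the standard deviation is unchanged by a constant shift in exact arithmetic, and for an array of identical values the shifted array is exactly 0.0 everywhere, so no rounding error can build up.

```python
import numpy as np

# A long array of identical values, as in the comment quoted above.
a = np.array([0.1] * 1_200_000)

# Plain call: rounding in the internally computed mean can leave a tiny
# nonzero result instead of exactly 0.
plain = np.std(a, ddof=0)

# Shifted call (an assumption of this sketch, not a NumPy feature):
# a - a[0] is exactly 0.0 for every element when all values are identical,
# so the mean and every deviation from it are exactly 0.0 too.
shifted = np.std(a - a[0], ddof=0)

# For data that is not constant, the shift changes the result only by rounding.
b = np.array([0.1, 0.2, 0.4, 0.8])
same_up_to_rounding = np.isclose(np.std(b), np.std(b - b[0]))

print(plain, shifted, same_up_to_rounding)
```

This avoids a separate uniqueness check, at the cost of one extra temporary array per call.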
Answer 0 (score: 4)
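Welcome to the world of practical numerical algorithms! In real life, if you have two floating-point numbers x and y, checking x == y is meaningless. So asking whether the standard deviation is exactly 0 makes no sense; what matters is whether it is close to 0. For that we use np.isclose: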
>>> import numpy as np
>>> np.isclose(1.80411241502e-16, 0)
True
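Effectively, this is the best you can hope for. In real life you can't even check that all items are the same, as you suggest. Are they floating-point numbers? Were they produced by some other process? Then they will carry small errors too.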
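If the downstream code really needs an exact 0, one option is to wrap the closeness check in a small helper. This is only a sketch under some assumptions: std_or_zero and its rtol parameter are hypothetical names, the input is assumed to be non-empty, and the tolerance is scaled by the largest absolute value in the data so that it still makes sense for very large or very small numbers. Note that np.isclose uses an absolute tolerance when comparing against 0, so the scaling is passed in through atol.

```python
import numpy as np

def std_or_zero(values, rtol=1e-9):
    """Return np.std(values, ddof=0), snapped to 0.0 when it is negligible.

    Hypothetical helper: "negligible" means no larger than rtol times the
    largest absolute value in the (non-empty) data.
    """
    a = np.asarray(values, dtype=float)
    s = np.std(a, ddof=0)
    scale = np.max(np.abs(a))
    # When comparing against 0 only atol matters, so pass the scaled
    # tolerance as atol and switch the relative term off.
    if np.isclose(s, 0.0, rtol=0.0, atol=rtol * scale):
        return 0.0
    return float(s)

print(std_or_zero([0.1] * 90))        # rounding noise is snapped to 0.0
print(std_or_zero([1.0, 2.0, 3.0]))   # a genuine spread is returned unchanged
```

The default rtol here is a guess; pick it from how much noise the rest of your estimate can tolerate.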