Question

I am working through a C Primer Plus exercise on floating-point underflow. The task is to simulate it. This is how I did it:

#include<stdio.h>
#include<float.h>

int main(void)
{
    // print min value for a positive float retaining full precision
    printf("%s\n %.150f\n", "Minimum positive float value retaining full precision:",FLT_MIN);

    // print min value for a positive float retaining full precision divided by two
    printf("%s\n %.150f\n", "Minimum positive float value retaining full precision divided by two:",FLT_MIN/2.0);

    // print min value for a positive float retaining full precision divided by four
    printf("%s\n %.150f\n", "Minimum positive float value retaining full precision divided by four:",FLT_MIN/4.0);

    return 0;
}

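I expected the minimum float value divided by 2 and by 4 to lose precision, but the precision looks fine and there is no sign of underflow. How is that possible? What am I missing?

Thanks a lot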

Answer 1

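Simply dividing FLT_MIN (which is certainly a power of 2) by 2 is the wrong way to assess precision. A power of 2 has only one significant bit, so halving it stays exact all the way down to the smallest subnormal value.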

Instead, start with a number just above a power of 2, so that its binary significand looks like 1.000...(maybe total of 24 binary digits)...0001. Make sure the value printed is originally a float. (FLT_MIN/2.0 is a double.)

Note that precision is lost once a number becomes smaller than FLT_MIN, the smallest normalized positive float.

Also consider FLT_TRUE_MIN (added in C11), the smallest positive float, which is subnormal. See binary32.

#include <float.h>
#include <math.h>
#include <stdio.h>

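int main(void) {
  const char *format = "%.10e %a\n";
  printf(format, FLT_MIN, FLT_MIN);            // smallest normalized float
  printf(format, FLT_TRUE_MIN, FLT_TRUE_MIN);  // smallest subnormal float

  // start just above a power of 2 so the lowest significand bit is set
  float f = nextafterf(1.0f, 2.0f);
  do {
    f /= 2;
    printf(format, f, f);  // print in decimal and hex for detail
  } while (f);             // stop once f has underflowed to zero
  return 0;
}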

Output

1.1754943508e-38 0x1p-126
1.4012984643e-45 0x1p-149

5.0000005960e-01 0x1.000002p-1
2.5000002980e-01 0x1.000002p-2
1.2500001490e-01 0x1.000002p-3
...
2.3509889819e-38 0x1.000002p-125
1.1754944910e-38 0x1.000002p-126
5.8774717541e-39 0x1p-127  // lost least significant bit of precision
2.9387358771e-39 0x1p-128
...
2.8025969286e-45 0x1p-148
1.4012984643e-45 0x1p-149
0.0000000000e+00 0x0p+0
