Question

I wrote a program in C to get a feel for how large the floating-point error becomes under repeated division.

#include <stdio.h>

int main (int argc, char* argv[]) {
    if (argc < 3) {
        printf("Enter a decimal number as the first positional " 
                "argument\n");
        printf("Enter the maximum number of digits to print as the " 
                "second positional argument\n");
        return 0;
    }   

    long double d;
    sscanf(argv[1], "%Lf", &d);
    int m;
    sscanf(argv[2], "%d", &m);

    int i;
    char format[32];  /* room for "%.<digits>Lf\n\n" with any int precision */
    for (i = 1; i <= m; ++i) {
        printf("(%d digits)\n", i); 
        snprintf(format, sizeof format, "%%.%dLf\n\n", i); 
        printf(format, d); 
    }   

    long double p = d;
    printf("\n");
    for (i = 1; i <= m; ++i) {
        printf("(%Lf/10e%d with %d digits)\n", d, i, m); 
        p = p/(long double)10.0;
        printf(format, p); 
    }
    return 0;
}

When run with the following arguments, this is one entry from the output:

$ fpe 0.1 700
.
.
.
(0.100000/10e180 with 700 digits)
0.0000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000999999999999999999969819570700939858153376
736698732853283605408116087882762948991724868957176649769045358705872354052
261113540314114885779914335315639806061208847920179776799404948795506248532
485303630811119507604985596684233990126219304092175565232198569923253737561
276484626462077772036038845251286782974821021132356946292172207615386395848
331484216638642723800290357587296443408362280895970909637712494349003491485
594533190659822910753768473307578901199121901299804449081420898437500000000
000000000000000000000000000
.
.
.

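Here we see 485 digits of floating-point noise. This was compiled with gcc 4.4.3, and I assume it is using 80-bit extended precision. However, 485 decimal digits is far more than 80 bits of information. So my question is: where does this information come from?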

Answer 1

No extra information is being printed. The value printed is exactly the value of p.

After 180 iterations, p is +0x1.A8E90F9908E0CA56p-602, which is 15309010345804195115•2^-665. The IEEE 754 standard defines the value of a floating-point number as its sign (+1 or -1) multiplied by an integer power of two (determined by the exponent field) multiplied by the value of its significand (the fraction part). So every floating-point number has one specific value. The above is the value of p. In decimal, that value is exactly .9999999999999999999698195707009398581533767366987328532836054081160878827629489917248689571766497690453587058723540522611135403141148857799143353156398060612088479201797767994049487955062485324853036308111195076049855966842339901262193040921755652321985699232537375612764846264620777720360388452512867829748210211323569462921722076153863958483314842166386427238002903575872964434083622808959709096377124943490034914855945331906598229107537684733075789011991219012998044490814208984375•10^-181

That is the value your program produced. So your output formatter printed the value of p exactly. It did a good job.
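The decomposition described above (sign times an integer times a power of two) can be recovered in C with frexpl. The following is only an illustrative sketch, not part of the original answer: the helper names split_long_double and show_long_double are made up here, and it assumes the significand has at most 64 bits, as with x87 extended precision.

```c
#include <float.h>
#include <math.h>
#include <stdio.h>

/* Split a finite, nonzero x into sign * m * 2^e with m an odd integer.
   Assumption: the significand has at most 64 bits (true for x87
   extended precision and where long double is a plain double). */
static void split_long_double(long double x, int *sign,
                              unsigned long long *m, int *e)
{
    int exp2;
    long double f = frexpl(fabsl(x), &exp2);   /* f in [0.5, 1) */

    *sign = signbit(x) ? -1 : +1;
    *m = (unsigned long long)ldexpl(f, 64);    /* f * 2^64 as an integer */
    *e = exp2 - 64;
    while (*m != 0 && (*m & 1u) == 0) {        /* strip trailing zero bits */
        *m >>= 1;
        ++*e;
    }
}

/* Print x in hex-float form and as sign * m * 2^e. */
static void show_long_double(long double x)
{
    int sign, e;
    unsigned long long m;

    split_long_double(x, &sign, &m, &e);
    printf("%La = %+d * %llu * 2^%d\n", x, sign, m, e);
}
```

Called on the question's p after 180 divisions, on a platform with 80-bit long double, this should report the integer 15309010345804195115 and the exponent -665 quoted above. The exact text produced by %La may differ between C libraries.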

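In fact, floating point did a good job all around. That value is the long double value closest to 10^-181. It is not possible to get any closer in long double. So, even after hundreds of arithmetic operations, the error has not grown.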

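There is no new information here. If we were told only the bits in the representation of p, we could produce the same hundreds of decimal digits. They do not tell you anything new. But they are not garbage either; they are completely determined by the value of p.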
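To make that concrete: because 2^-k equals 5^k / 10^k, the exact decimal digits of m•2^-k are simply the digits of the integer m•5^k, with the decimal point shifted k places. The sketch below does that multiplication on an array of decimal digits. It is an illustration only; the function name exact_digits and the fixed 1024-digit working buffer are assumptions made for this example.

```c
#include <stddef.h>

/* Write the decimal digits of m * 5^k into out, most significant first,
   and return the digit count (0 if a buffer is too small).
   The value m * 2^-k is this digit string times 10^-k. */
static size_t exact_digits(unsigned long long m, unsigned k,
                           char *out, size_t cap)
{
    unsigned char d[1024];                    /* little-endian digits */
    size_t n = 0;

    if (m == 0)
        d[n++] = 0;
    for (; m != 0; m /= 10)
        d[n++] = (unsigned char)(m % 10);

    for (unsigned step = 0; step < k; ++step) {   /* multiply by 5, k times */
        unsigned carry = 0;
        for (size_t i = 0; i < n; ++i) {
            unsigned v = d[i] * 5u + carry;
            d[i] = (unsigned char)(v % 10);
            carry = v / 10;                   /* always below 5 */
        }
        if (carry != 0) {
            if (n == sizeof d)
                return 0;
            d[n++] = (unsigned char)carry;
        }
    }

    if (n + 1 > cap)
        return 0;
    for (size_t i = 0; i < n; ++i)
        out[i] = (char)('0' + d[n - 1 - i]);
    out[n] = '\0';
    return n;
}
```

With m = 15309010345804195115 and k = 665, the significand and exponent given in this answer, the function should regenerate the long digit string from the question using nothing but those two integers.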

Answer 2

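To add some further information to Eric's excellent answer: the 181st iteration, computed your way, happens to be the long double nearest to 10^-181, but this does not hold for every n...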

For example, when computed in long double, 1/10.0/10.0/10.0/10.0 != 1/10000.0.
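A small C check of that claim might look like the sketch below. The function names are invented for this illustration, and the result depends on the platform: the inequality is reported for 80-bit extended precision, and other long double formats may behave differently.

```c
#include <stdio.h>

/* Divide 1 by 10, n times, rounding to long double after every step. */
static long double div10_repeated(int n)
{
    long double p = 1.0L;

    for (int i = 0; i < n; ++i)
        p /= 10.0L;
    return p;
}

/* Return 1 if four successive divisions by 10 give the same long double
   as a single division by 10000, and 0 otherwise. */
static int same_as_single_division(void)
{
    long double direct = 1.0L / 10000.0L;

    return div10_repeated(4) == direct;
}
```

Printing both values with a format such as "%.30Lg" shows where they start to differ, if they differ at all on the machine at hand.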

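Using my own floating-point emulation package in Squeak Smalltalk, http://code.google.com/p/arbitrary-precision-float/, I can tell that among the first 300 values of 10^-n computed this way, 77 are the nearest long double and 223 are not.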

(1 to: 300) count: [:n |
    ((1 to: n) inject: (1 asArbitraryPrecisionFloatNumBits: 64) into: [:p :i | p/10])
    ~= ((10 raisedTo: n negated) asArbitraryPrecisionFloatNumBits: 64)]

The difference peaks at 4 ulp, for 10^-218.

(1 to: 300) detectMax: [:n |
    (((1 to: n) inject: (1 asArbitraryPrecisionFloatNumBits: 64) into: [:p :i | p/10])
    - ((10 raisedTo: n negated) asArbitraryPrecisionFloatNumBits: 64)) abs
    / (2 raisedTo: -63+((10 raisedTo: n negated) floorLog: 2))].

Here is how the error evolves, in ulp:

(1 to: 300) collect: [:n |
    ((((1 to: n) inject: (1 asArbitraryPrecisionFloatNumBits: 64) into: [:p :i | p/10])
    - ((10 raisedTo: n negated) asArbitraryPrecisionFloatNumBits: 64))
    / (2 raisedTo: -63+((10 raisedTo: n negated) floorLog: 2))) asInteger].

#(0  0  0 -1 -1 -1 -1 -1 -1 -2 -1 -1 -1 -1 -1 -1 -1 -1 -2 -2
 -1 -1 -1 -1 -1 -1  0 -1  0  0  0  1  0  0  0  0  1  0  0  0
  0  0  0 -1  0  0  0  1  0  0  0  0  0  0  1  1  1  1  0  0
  0  0 -1  0 -1 -1 -1 -1 -2 -1 -1 -2 -2 -2 -3 -2 -2 -3 -2 -2
 -3 -2 -2 -2 -2 -1 -2 -1 -1 -2 -2 -2 -1 -2 -2 -1 -2 -2 -1 -2
 -2 -2 -3 -2 -1 -2 -2 -1 -2 -2 -1 -2 -2 -2 -3 -2 -2 -1 -1 -1
 -1 -1 -1 -1 -2 -2 -1 -3 -2 -2 -3 -2 -2 -3 -3 -2 -2 -2 -2 -3
 -2 -2 -3 -3 -2 -3 -2 -2 -2 -3 -2 -2 -3 -2 -1 -2 -2 -1 -2 -1
 -1 -2 -1 -1 -1  0  0  0  0  0  1  1  0  0  0  0  0  1  0  0
  0  0  0  0 -1 -1 -1 -1  0  0  0  0 -1 -1 -1 -2 -1  0 -1 -1
 -1 -1 -1 -2 -1 -1 -1 -1 -2 -2 -2 -2 -2 -2 -3 -3 -2 -4 -3 -2
 -3 -2 -2 -3 -2 -2 -2 -2 -1 -3 -2 -2 -3 -3 -2 -1 -2 -2 -1 -2
 -2 -1 -3 -2 -2 -3 -3 -2 -3 -2 -1 -1 -1  0  0  0  0  0 -1  0
  0 -1  0  0 -1  0  0  0  0  0 -1 -1  0  0 -1 -1 -1 -1  0  0
  0  1  1  1  0  1  1  1  1  0  1  1  1  1  1  1  0  0  0  0)
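A rough C counterpart of the Smalltalk measurement above is sketched here. It is not the code that produced the table. The function name ulp_error_div10 is invented for this example, and the reference value comes from strtold, which is assumed to round the decimal string correctly (glibc does, but the C standard does not require it).

```c
#include <float.h>
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

/* Error, in ulps of the reference, of computing 10^-n by n successive
   divisions by 10 in long double.
   Assumption: strtold returns the correctly rounded value of "1e-n". */
static long double ulp_error_div10(int n)
{
    char text[32];
    snprintf(text, sizeof text, "1e-%d", n);
    long double ref = strtold(text, NULL);

    long double p = 1.0L;
    for (int i = 0; i < n; ++i)
        p /= 10.0L;

    int e;
    frexpl(ref, &e);                       /* ref = f * 2^e, f in [0.5, 1) */
    long double ulp = ldexpl(1.0L, e - LDBL_MANT_DIG);
    return (p - ref) / ulp;
}
```

Looping n from 1 to 300 and printing each result with "%.0Lf" gives a sequence that can be compared against the table; on platforms where long double is not the 80-bit extended format, the numbers will differ.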

Long double floating-point error
