Question

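Assume a typical programming language.

I have two floating-point numbers a and b that are close to each other (that is, the absolute value of their difference is much smaller than their magnitude):

|a - b| << |a + b| / 2

Mathematically, we have

exp(a - b) = exp(a) / exp(b)

But when programming, you can choose either to compute (a - b) first and then exponentiate, or to exponentiate a, then b, and then divide the two results.

If a and b are very close, the precision of (a - b) may be poor.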

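Example

(1 + pi * 10^-20) - (1 + 1.1 * pi * 10^-20) = -pi * 10^-21

But if you use a floating-point format with only 19 decimal digits of precision, you get zero as the answer, which is terrible precision. You can get better precision by reordering the operations as follows:

(1 - 1) + (pi * 10^-20 - 1.1 * pi * 10^-20) = -pi * 10^-21

which gives you -pi * 10^-21 to 19 decimal digits of precision.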

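So my question is: given a finite floating-point precision, which method of computing exp(a - b) gives better precision?

The exponential of the difference:

exp(a - b)

or the quotient of the exponentials:

exp(a) / exp(b)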

Answer 1

"If a and b are very close, the precision of (a - b) may be poor."

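On the contrary, if a and b are close, then a - b is exact (Sterbenz lemma: when b/2 <= a <= 2b, the floating-point subtraction incurs no rounding error). exp(a-b) therefore involves only one rounding step (assuming a correctly rounded exp function, to simplify the reasoning), so it gives you the correctly rounded result for the expression you are trying to compute. No other way of computing it can do better than that.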

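A case where exp(a-b) is clearly better than the alternative is when exp(a) and exp(b) each overflow, so that the division yields NaN. By contrast, exp(a-b) produces +inf only when that is the best floating-point approximation of the mathematical result, and it never produces NaN for finite a and b.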

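Note: the problem you may actually have is that a and b are themselves approximate results of earlier computations, so that the relative accuracy of a - b is terrible. This is an inherent problem of floating point (cancellation), but by the time you compute exp(a-b), in whatever way, it is too late to address it: the information has already been lost. All you can do is compute exp(a-b) for the a and b you have, and the right way to do that is exp(a-b).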

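For simple expressions where one expects extra bits of precision in the computation to yield extra precision in the final result, you can also test these things empirically by computing a reference result with extra precision. Had I approached the question this way, I might have written the following C program (for my platform, where the compiler defines FLT_EVAL_METHOD as 0 and long double is the IEEE 754 80-bit double-extended format):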

#include <math.h>
#include <stdlib.h>
#include <stdio.h>

double best(double a, double b) {
  return exp(a-b);
}

double other(double a, double b) {
  return exp(a) / exp(b);
}

long double reference(long double a, long double b) {
  // assume the method doesn't matter so much with
  // long double computations, use any method:
  return expl(a-b);
}

int main(void) {
  for (int i = 0; i < 10; i++) {
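    // combine three rand() results into one wide random integer,
    // then scale it down by 2^64:
    double a = rand() ^ ((long long)rand())<<16 ^ ((long long)rand()) << 32;
    a /= 0x1.0p64;
    // b exceeds a by at most 20%, so a and b are close to each other:
    double b = a * (1 + (double)rand() / (5.0 * RAND_MAX));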
    printf("a=%a\nb=%a\n", a, b);

    double be = best(a, b);
    double o = other(a, b);

    long double r = reference(a, b);

    printf("error sub then exp:%La\n", fabsl(r - be));
    printf("error exp then div:%La\n", fabsl(r - o));
  }
}

Running the above program produces:

~ $ gcc ex.c && ./a.out
a=0x1.82def03cebc5p-2
b=0x1.a65bc869d479cp-2
error sub then exp:0xa.dp-60
error exp then div:0xa.dp-60
a=0x1.8164b7a7be6dep-6
b=0x1.b5b82c63fa0bcp-6
error sub then exp:0x8.1p-58
error exp then div:0x8.1p-58
a=0x1.88b779e295f18p-3
b=0x1.b1836e39a5186p-3
error sub then exp:0x8.5p-59
error exp then div:0xd.ecp-57
a=0x1.b5f437ec6fc4ap-6
b=0x1.e459d0a1d64b9p-6
error sub then exp:0xa.1p-59
error exp then div:0xd.7cp-57
a=0x1.889e1b20a7c9dp-3
b=0x1.8dddc5217d12fp-3
error sub then exp:0x9.4p-61
error exp then div:0xf.6cp-57
a=0x1.2d8f07147984cp-2
b=0x1.65acc944015e6p-2
error sub then exp:0x8.ep-58
error exp then div:0x8.ep-58
a=0x1.78b8432157465p-5
b=0x1.a9fd15e131562p-5
error sub then exp:0x9.4p-61
error exp then div:0xf.6cp-57
a=0x1.d214f566a0e1ep-2
b=0x1.0c90cb63e50c4p-1
error sub then exp:0xd.54p-58
error exp then div:0xd.54p-58
a=0x1.78dfa1c5c2ac4p-2
b=0x1.919d3723ae61p-2
error sub then exp:0x9.e8p-59
error exp then div:0x9.e8p-59
a=0x1.fb68c3c57fdd3p-2
b=0x1.103e0374df0bbp-1
error sub then exp:0xd.54p-58
error exp then div:0x9.56p-57
