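Consider the operation (a-b)/(c-d), where a, b, c and d are floating-point numbers (namely, of type double in C++). Both (a-b) and (c-d) are (sum - correction) pairs, as in the Kahan summation algorithm. Briefly, the defining property of these (sum - correction) pairs is that sum holds a value that is large relative to the value in correction. More precisely, correction holds whatever did not fit into sum during summation because of numerical limitations (the 53-bit mantissa of the double type).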
Given these properties of the numbers, what is the numerically most precise way to compute (a-b)/(c-d)?
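Bonus question: it would be preferable to get the result as a (sum - correction) pair as well, just as in the Kahan summation algorithm. That is, to find (e-f)=(a-b)/(c-d), rather than just e=(a-b)/(c-d).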
Answer 0: (score: 4)
The div2 algorithm of Dekker (1971) is a good approach.
It requires a mul12(p,q) algorithm that exactly computes a pair u+v = p*q. Dekker uses a method known as Veltkamp splitting, but if you have access to an fma function, then a much simpler method is:
u = p*q
v = fma(p,q,-u)
The actual division then looks like this (I have had to change some of the signs, since Dekker uses additive pairs instead of subtractive ones):
r = a/c
u,v = mul12(r,c)
s = (a - u - v - b + r*d)/c
The sum r+s is then an accurate approximation to (a-b)/(c-d).
UPDATE: The subtraction and addition are assumed to be left-associative, i.e.
s = ((((a-u)-v)-b)+r*d)/c
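The steps above could be sketched in C++ roughly as follows. This is only an illustration under stated assumptions: the struct names ProductPair and QuotientPair are hypothetical, std::fma is assumed to be correctly rounded, and overflow and underflow are ignored.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical holder for an exact product: u + v == p * q
// (barring overflow and underflow).
struct ProductPair { double u; double v; };

inline ProductPair mul12(double p, double q) {
    double u = p * q;               // rounded product
    double v = std::fma(p, q, -u);  // exact rounding error of that product
    return {u, v};
}

// Hypothetical holder for the quotient: r + s approximates (a-b)/(c-d).
struct QuotientPair { double r; double s; };

inline QuotientPair div2(double a, double b, double c, double d) {
    assert(c != 0.0);               // precondition assumed by this sketch
    double r = a / c;               // first approximation of the quotient
    ProductPair rc = mul12(r, c);   // rc.u + rc.v == r * c exactly
    // Left-associative evaluation, as written in the answer.
    double s = ((((a - rc.u) - rc.v) - b) + r * d) / c;
    return {r, s};
}
```

To obtain the subtractive form (e-f) asked for in the question, one could take e = r and f = -s. As with any compensated arithmetic, this assumes the compiler is not allowed to reassociate floating-point expressions.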
This works because if we let rr be the error in the computation of r (i.e. r + rr = a/c exactly), then since u+v = r*c exactly, we have rr*c = a-u-v exactly. Therefore (a-u-v-b)/c gives a fairly good approximation to the correction term of (a-b)/c.
The final r*d arises due to the following:
(a-b)/(c-d) = (a-b)/c * c/(c-d) = (a-b)/c *(1 + d/(c-d))
= [a-b + (a-b)/(c-d) * d]/c
Now r is also a fairly good initial approximation to (a-b)/(c-d), so we substitute it for (a-b)/(c-d) inside the [...]. We then find that (a-u-v-b+r*d)/c is a good approximation to the correction term of (a-b)/(c-d).
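To make the size of the substitution error explicit, the argument can be written out as below. The symbol q for the exact quotient is introduced here for illustration only, and rounding errors in evaluating s itself are ignored.

```latex
\begin{align*}
q &= \frac{a-b}{c-d}
   \;\Longrightarrow\; q\,c = a - b + q\,d
   \;\Longrightarrow\; q = \frac{a - b + q\,d}{c}, \\
q - r &= \frac{a - b + q\,d - r\,c}{c}
       = \frac{a - u - v - b + q\,d}{c}
       \qquad (\text{since } u + v = r\,c \text{ exactly}), \\
s &= \frac{a - u - v - b + r\,d}{c}
   \;\Longrightarrow\; (q - r) - s = \frac{(q - r)\,d}{c}.
\end{align*}
```

Since d is tiny relative to c and r is already close to q, the leftover term (q-r)*d/c is a product of two small relative quantities, which suggests that replacing q by r costs only a second-order error.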
Answer 1: (score: 0)
For tiny corrections, one might consider the first-order expansion
(a - b) / (c - d) = a/c (1 - b/a) / (1 - d/c) ~ a/c (1 - b/a + d/c)