How to know if a fraction will be rounded up when represented in floating point format (re: java remainder [%] results when using fp's)

Time: 2016-10-15 17:07:14

Tags: java floating-point division cpu-architecture

Is there a simple way to tell whether a particular number gets rounded up in its floating-point representation? The reason I ask is related to a question I asked here, and a similar question was asked here, amongst others.

To recap, I was trying to ask why, for example, the expression 0.5 % 0.1 doesn't result in approximately zero but instead gives (approximately) 0.1. Many respondents blah on about how most numbers can't be exactly represented and so on but fail to actually explain why, for certain values, the result from the % operator is so far from zero when there is no remainder. It took me a long time to work out what was happening and I think it's worth sharing. Also, it explains why I've asked my question.

It seems that the % operator doesn't result in zero when it should if the divisor is rounded up in its floating-point format but the dividend isn't. The division algorithm iteratively subtracts the divisor from the dividend until a further subtraction would result in a negative value. The quotient is the number of iterations and the remainder is what's left of the dividend. It may not be immediately clear why this results in errors (it certainly wasn't to me), so I'll give an example.

For the 0.5 % 0.1 = (approximately) 0.1 case, 0.5 can be represented exactly, but 0.1 cannot and is rounded up. In binary, 0.5 is represented simply as 0.1, but 0.1 in binary is 0.00011001100..., with the last four digits (1100) repeating forever. Because of the way the floating-point format works, this gets truncated to 23 digits (in single precision) after the initial 1. (See the much-cited What Every Computer Scientist Should Know About Floating-Point Arithmetic for a full explanation.) Then it's rounded up, as this is closer to the 0.1 (decimal) value. So, the values that the division algorithm works with are:

0.1 0000 0000 0000 0000 0000 000 --> 0.5 (decimal), and

0.0001 1001 1001 1001 1001 1001 101 --> 0.1 (decimal)
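The rounding direction of the two operands can be checked directly rather than by hand. What follows is only a minimal sketch, not something from the original post: it relies on the fact that `new BigDecimal(double)` yields the exact value of the stored number (a float widens to double without change), and compares that with the decimal literal. The class and method names are made up for illustration.

```java
import java.math.BigDecimal;

// Hypothetical helper: compares the exact stored value of a float with the
// decimal literal it was meant to represent.
final class RoundingDirection {

    /**
     * Returns a positive number if the float is stored above the decimal
     * value (rounded up), a negative number if it is stored below it
     * (rounded down), and 0 if the decimal value is represented exactly.
     */
    static int direction(float stored, String decimal) {
        // new BigDecimal(double) is exact; the float widens to double losslessly.
        BigDecimal exactStored = new BigDecimal(stored);
        BigDecimal intended = new BigDecimal(decimal);
        return exactStored.compareTo(intended);
    }

    public static void main(String[] args) {
        System.out.println("0.5f exact value: " + new BigDecimal(0.5f));
        System.out.println("0.1f exact value: " + new BigDecimal(0.1f));
        System.out.println("0.5f direction:   " + direction(0.5f, "0.5"));
        System.out.println("0.1f direction:   " + direction(0.1f, "0.1"));
    }
}
```

The same comparison works for double operands by changing the parameter type; the direction can differ between single and double precision for the same decimal literal.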

The division algorithm iterations are:

(1) 0.100000000000000000000000000 - 0.000110011001100110011001101 =

(2) 0.011001100110011001100110011 - 0.000110011001100110011001101 =

(3) 0.01001100110011001100110011 - 0.000110011001100110011001101 =

(4) 0.001100110011001100110011001 - 0.000110011001100110011001101 =

(x) 0.0001100110011001100110011 - 0.000110011001100110011001101 =

-0.000000000000000000000000001

As shown, after the 4th iteration a further subtraction would result in a negative value, so the algorithm stops and the value of the dividend left over (the left-hand operand of step (x), 0.0001100110011001100110011) is the remainder, an approximation of decimal 0.1.
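The iterations above can be replayed in code. This is a sketch under stated assumptions: it uses `BigDecimal` so that every subtraction is exact, as in the hand calculation (a plain float subtraction would round the intermediate values), and it assumes a positive divisor and a small quotient. The class and method names are hypothetical. Java's `0.5 % 0.1` literal expression works in double precision, but the single-precision case shown here behaves the same way because 0.1 is rounded up in both formats.

```java
import java.math.BigDecimal;

// Hypothetical replay of the repeated-subtraction walk-through, done with
// exact arithmetic on the stored single-precision values.
final class SubtractionRemainder {

    /** Returns { quotient, remainder } for dividend >= 0 and divisor > 0. */
    static BigDecimal[] divide(float dividend, float divisor) {
        if (!(divisor > 0f) || !(dividend >= 0f)) {
            throw new IllegalArgumentException("expects dividend >= 0 and divisor > 0");
        }
        BigDecimal rest = new BigDecimal(dividend); // exact stored value
        BigDecimal step = new BigDecimal(divisor);  // exact stored value
        int count = 0;
        // Stop as soon as one more subtraction would go negative.
        while (rest.compareTo(step) >= 0) {
            rest = rest.subtract(step);
            count++;
        }
        return new BigDecimal[] { BigDecimal.valueOf(count), rest };
    }

    public static void main(String[] args) {
        BigDecimal[] result = divide(0.5f, 0.1f);
        System.out.println("quotient:    " + result[0]);
        System.out.println("remainder:   " + result[1]);
        System.out.println("0.5f % 0.1f: " + new BigDecimal(0.5f % 0.1f));
    }
}
```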

Further, the expression 0.6 % 0.1 works as expected, as 0.6 gets rounded up. The expression 0.7 % 0.1 doesn't work as expected: although 0.7 can't be represented exactly, it doesn't get rounded up. I've not tested this exhaustively, but I think this is what's going on. Which brings me (at last!) to my actual question:

Does anyone know of a simple way to tell if a particular number will be rounded up?

1 answer:

Answer 0: (score: 0)

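Let's consider the case where a and b are floats with a > b > 0. Every float is a multiple of its ulp, so we can write:

a = na*ulp(a), where ulp(a) = 2^ea. na is the integer significand of a, and ea is its biased exponent.

b = nb*ulp(b), where ulp(b) = 2^eb. nb is the integer significand of b, and eb is its biased exponent.

For normalized floats, 2^p > na >= 2^(p-1), where p is the floating-point precision (p = 53 bits for IEEE 754 double precision).

So we can perform a (possibly large) integer division: na*2^(ea-eb) = nb*q + nr

From this we derive na*2^(ea-eb)*2^eb = nb*2^eb*q + nr*2^eb, that is, a = b*q + nr*2^eb. In other words, nr is the integer significand of the floating-point remainder before normalization, and eb is its biased exponent.

From this we can see that the remainder operation is exact: since nr < nb, the remainder is representable as a float. Strictly speaking, the remainder is never rounded.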
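The integer-division argument can be sketched in Java. This is an illustration under assumptions, not the answerer's code: it handles only finite doubles with a >= b > 0, pulls the integer significand and the exponent of the ulp out of the bit pattern, and uses `BigInteger` for the possibly large shifted dividend. The class and method names are hypothetical.

```java
import java.math.BigInteger;

// Hypothetical sketch: the floating-point remainder computed as an exact
// integer division of significands, for finite a >= b > 0.
final class ExactRemainder {

    /** Integer significand n such that x = n * 2^ulpExponent(x). */
    static long significand(double x) {
        long bits = Double.doubleToLongBits(x);
        long fraction = bits & 0x000FFFFFFFFFFFFFL;
        long expField = (bits >>> 52) & 0x7FF;
        // Normal numbers carry an implicit leading 1; subnormals do not.
        return expField == 0 ? fraction : (fraction | (1L << 52));
    }

    /** Exponent e of the ulp, so that x = significand(x) * 2^e. */
    static int ulpExponent(double x) {
        long expField = (Double.doubleToLongBits(x) >>> 52) & 0x7FF;
        return (int) (expField == 0 ? 1 : expField) - 1075;
    }

    static double remainder(double a, double b) {
        BigInteger na = BigInteger.valueOf(significand(a));
        BigInteger nb = BigInteger.valueOf(significand(b));
        int ea = ulpExponent(a);
        int eb = ulpExponent(b);
        // na * 2^(ea - eb) = nb * q + nr, with 0 <= nr < nb (ea >= eb since a >= b).
        BigInteger nr = na.shiftLeft(ea - eb).mod(nb);
        // nr < 2^53, so the conversion and the scaling are both exact.
        return Math.scalb((double) nr.longValueExact(), eb);
    }

    public static void main(String[] args) {
        System.out.println(remainder(0.5, 0.1));
        System.out.println(0.5 % 0.1);
    }
}
```

Both lines print the same value, which is consistent with the claim that the remainder itself is never rounded.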

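When the quotient is rounded to the nearest integer instead of being truncated, this is the IEEE remainder operation:

a=b*q+r

The remainder can then be negative, r<0, in which case you are interested in:

a=b*(q-1) + (b+r)

I think this case, where a negative r forces a result of b+r, is what you are calling being rounded up. Unfortunately, there is no simple way to tell whether the remainder will be negative without performing the operation, unless nb is a power of 2 (2^(p-1), or less in the case of gradual underflow).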
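Java exposes both flavours: `%` truncates the quotient, while `Math.IEEEremainder` rounds it to the nearest integer. The following sketch (class and method names are hypothetical, and it assumes a > b > 0) rebuilds the `%` result from the IEEE remainder using the a = b*(q-1) + (b+r) identity above.

```java
// Hypothetical sketch relating Math.IEEEremainder to the % operator
// for a > b > 0.
final class IeeeRemainderDemo {

    /** Truncating remainder reconstructed from the round-to-nearest one. */
    static double truncatedFromIeee(double a, double b) {
        double r = Math.IEEEremainder(a, b); // quotient rounded to nearest
        // A negative r means the quotient was rounded up, so step back by one b.
        return r < 0 ? b + r : r;
    }

    public static void main(String[] args) {
        System.out.println("0.5 % 0.1           = " + (0.5 % 0.1));
        System.out.println("IEEEremainder       = " + Math.IEEEremainder(0.5, 0.1));
        System.out.println("rebuilt from IEEE r = " + truncatedFromIeee(0.5, 0.1));
    }
}
```

For 0.5 and 0.1 the IEEE remainder is a tiny negative number, which is the "so far from zero" case in the question seen from the other side.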

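But you seem to be interested in the particular case a=i/10^j, b=1/10^j, where you only have the floating-point approximations float(i/10^j) and float(1/10^j). Assuming that 10^j and i are exactly representable (j<23 in double precision and i<=2^53), we can use a fused multiply-add to get at the representation errors (from here on, ea and eb denote these scaled errors, not the exponents used earlier):

ea=fma(10^j,float(i/10^j),-i).  10^j*float(a)=10^j*a+ea.
eb=fma(10^j,float(1/10^j),-1).  10^j*float(b)=10^j*b+eb.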
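In Java, the fused multiply-add is available as `Math.fma` (Java 9 and later). The sketch below is an illustration only: the class and method names are hypothetical, and it restricts j to 15 or less so that 10^j can be built exactly in a long. The sign of the result tells whether the stored value lies above or below i/10^j.

```java
// Hypothetical sketch: scaled representation error of float(i/10^j),
// computed with a fused multiply-add (requires Java 9+).
final class RepresentationError {

    /** 10^j as a double, built by exact integer multiplication (j <= 15). */
    static double pow10(int j) {
        if (j < 0 || j > 15) {
            throw new IllegalArgumentException("this sketch supports 0 <= j <= 15");
        }
        long p = 1;
        for (int k = 0; k < j; k++) {
            p *= 10;
        }
        return (double) p;
    }

    /** Returns 10^j * float(i/10^j) - i, rounded only once by the fma. */
    static double scaledError(long i, int j) {
        double scale = pow10(j);
        double approx = (double) i / scale; // correctly rounded i/10^j
        return Math.fma(scale, approx, -(double) i);
    }

    public static void main(String[] args) {
        for (long i : new long[] { 1, 5, 6, 7 }) {
            System.out.println(i + "/10 -> scaled error " + scaledError(i, 1));
        }
    }
}
```

A positive value means the double was rounded up, a negative value means it was rounded down, and zero means it is exact.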

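You have i*b=a.
Now you want to compare which way the two floating-point approximations went, so you can take the remainder:

r = (a+ea/10^j)-i*(b+eb/10^j) = 1/10^j * ea - i/10^j * eb.

A floating-point approximation of this may work, but not always:

float(float(float(b)*ea) - float(float(a)*eb))

However, you are better off using fma again:

r = fma(-i,eb,ea)/10^j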
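The formula above can be tried out with `Math.fma` (Java 9 and later). This is a sketch under assumptions: the helper names are made up, j is limited to 15 or less so that 10^j fits exactly in a long, and the comparison with `Math.IEEEremainder` is only there as a sanity check of the sign.

```java
// Hypothetical sketch of r = fma(-i, eb, ea) / 10^j (requires Java 9+).
final class ApproximationSide {

    /** 10^j as a double, built by exact integer multiplication (j <= 15). */
    static double pow10(int j) {
        if (j < 0 || j > 15) {
            throw new IllegalArgumentException("this sketch supports 0 <= j <= 15");
        }
        long p = 1;
        for (int k = 0; k < j; k++) {
            p *= 10;
        }
        return (double) p;
    }

    /** Scaled representation error 10^j * float(i/10^j) - i. */
    static double scaledError(long i, int j) {
        double scale = pow10(j);
        return Math.fma(scale, (double) i / scale, -(double) i);
    }

    /** Estimate of float(i/10^j) - i * float(1/10^j). */
    static double residual(long i, int j) {
        double ea = scaledError(i, j);
        double eb = scaledError(1, j);
        return Math.fma(-(double) i, eb, ea) / pow10(j);
    }

    public static void main(String[] args) {
        for (long i = 1; i <= 9; i++) {
            System.out.println(i + ": r = " + residual(i, 1)
                    + ", IEEEremainder = " + Math.IEEEremainder(i / 10.0, 0.1));
        }
    }
}
```

A negative r corresponds to the cases where `%` returns a value close to the divisor instead of a value close to zero.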

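The sign of the remainder tells you which side the floating-point approximations fell on...
Here we have simplified the problem a little, because we did not consider the case where the quotient can be off by more than 1. That should be fine because i < 2^53, but we have not proved it. It is only an exercise in style anyway, since we have replaced a simple expression with a more complicated one.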