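Retrieving a fixed-point value from a binary string

Time: 2014-09-12 22:09:05

Tags: python encoding decimal

I am implementing an arithmetic encoder and decoder. One of its submodules converts a fixed-point value to a binary string and back.

The string to be encoded is long, so I have to work with a lot of precision (prec = 250). Once I have the Decimal value, it is converted to a binary string (decimal_to_binary_str). The problem is retrieving the Decimal value from that binary string (binary_str_to_decimal). Can someone point out what is wrong?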

import decimal

def decimal_to_binary_str(f, l):
    res = ""
    var = f - int(f)
    while var != 0 and len(res) < l:
        var = var*2
        res += '{0:.250f}'.format(var)[0]
        var = var - int(var)
    return res

def binary_str_to_decimal(bit_str):
    a = decimal.Decimal(0)
    b = decimal.Decimal(1)
    res = decimal.Decimal(0.5)

    for j in bit_str:
        if j == '0':
            b -= (b-a)/2
            res = res - (b-a)/2
        elif j == '1':
            a += (b-a)/2
            res = res + (b-a)/2
    return res

if __name__ == "__main__":
    decimal.setcontext(decimal.Context(prec=250))
    f = decimal.Decimal('0.2157862006526829178278743649908677246339070461540076509576931506281654871337287537411826845238756634211771470806684727059082493570941004017476336218118077112203256327565095923853074403357680122784788435203804147809638828466085270584355766316314164101')
    print f
    b = decimal_to_binary_str(f, 669)
    print binary_str_to_decimal(b)


'''
I have marked with a pipe symbol (|) the point where the retrieved value starts to differ from the original.
Output:
0.215786200652682917827874364990867724633907046154007650957693150628165487133728753741182684523875663421177147080668472705908249357094100401747633621811807711220325632756509592385307440335768012278478843|5203804147809638828466085270584355766316314164101
0.215786200652682917827874364990867724633907046154007650957693150628165487133728753741182684523875663421177147080668472705908249357094100401747633621811807711220325632756509592385307440335768012278478843|3376547904539139004978887353289003223234434034223
'''

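1 answer:

Answer 0 (score: 1)

The problem lies in the decimal_to_binary_str function. When you stop the function after 669 bits, you truncate information: 669 bits carry only about 669 × log10(2) ≈ 201 decimal digits, which is roughly where your two outputs start to differ. When you convert the string back to a decimal, you see the effect of that lost information. You can see this clearly if you use different output lengths: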

f = decimal.Decimal('0.31')
out = decimal_to_binary_str(f, 64)
print binary_str_to_decimal(out)
Out: 0.30999999999999999997506335003283339801782858557999134063720703125
out = decimal_to_binary_str(f, 128)
print binary_str_to_decimal(out)
Out: 0.310000000000000000000000000000000000000411423022787800627789057788027785987236532944870230632528063097197446040809154510498046875
out = decimal_to_binary_str(f, 256)
print binary_str_to_decimal(out)
Out: 0.3100000000000000000000000000000000000000000000000000000000000000000000000000029362973087321111726313596333521358541794401239083356709080591798573212130752251997103125511805779970305860799416908856229783336419228618758420212841997454233933240175247192
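
The shrinking error in the outputs above can also be measured directly. The following is only a minimal sketch, written for Python 3; frac_to_bits and bits_to_frac are hypothetical helpers, not the functions from the question, and bits_to_frac returns the truncated value rather than the interval midpoint. The working precision of 400 digits is an assumption chosen so that every step up to 256 bits is exact.

```python
from decimal import Decimal, getcontext

# Assumption: 400 digits is enough for exact arithmetic up to 256 bits.
getcontext().prec = 400

def frac_to_bits(x, n_bits):
    """First n_bits binary digits of a Decimal fraction 0 <= x < 1 (truncated)."""
    bits = []
    for _ in range(n_bits):
        x *= 2
        if x >= 1:
            bits.append('1')
            x -= 1
        else:
            bits.append('0')
    return ''.join(bits)

def bits_to_frac(bits):
    """Value of the binary fraction 0.b1b2b3... as a Decimal."""
    total = Decimal(0)
    weight = Decimal(1)
    for b in bits:
        weight /= 2          # weight of the current bit: 2**-position
        if b == '1':
            total += weight
    return total

f = Decimal('0.31')
for n in (64, 128, 256):
    err = f - bits_to_frac(frac_to_bits(f, n))
    # The truncation error stays below 2**-n but is never zero for 0.31.
    print(n, 0 < err < Decimal(2) ** -n)
```

Each line should report that the error is positive and below 2**-n, so doubling the number of bits squares the error bound without ever making it zero.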

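The retrieved value never exactly reaches the original number: a decimal fraction such as 0.31 has no finite binary expansion, so every finite bit string is only an approximation, and each additional bit merely halves the error. Finding the binary representation of a decimal number is a complicated problem. Take a look at the IEEE 754 standard and similar standards.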
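
If the number of decimal digits is known in advance, one possible workaround is to emit enough bits that the error falls below half a unit in the last decimal place, and then round the decoded value back to that many digits. The sketch below is written for Python 3 and uses exact integer and Fraction arithmetic instead of Decimal; bits_for_digits, encode and decode are hypothetical names, and the 250-digit input is made-up test data, not the value from the question.

```python
import math
from decimal import Decimal
from fractions import Fraction

def bits_for_digits(digits):
    """Smallest n with 2**n > 10**digits."""
    return (10 ** digits).bit_length()

def encode(text, digits):
    """Bit string for a decimal fraction 0 <= x < 1 written with `digits` decimals."""
    n = bits_for_digits(digits)
    k = math.floor(Fraction(text) * 2 ** n)   # truncate to n binary places
    return format(k, '0{}b'.format(n))

def decode(bits, digits):
    """Round the midpoint of the bit string's interval back to `digits` decimals."""
    n = len(bits)
    mid = Fraction(2 * int(bits, 2) + 1, 2 ** (n + 1))
    scaled = round(mid * 10 ** digits)        # exact integer rounding, no floats
    return Decimal('0.' + str(scaled).zfill(digits))

text = '0.' + '2157862006' * 25               # hypothetical 250-digit value
bits = encode(text, 250)
print(len(bits), decode(bits, 250) == Decimal(text))   # prints: 831 True
```

Under these assumptions 250 decimal digits need 831 bits, noticeably more than the 669 used in the question. The round trip works because the midpoint is within 2**-(n+1) of the original, which is less than half of 10**-digits.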