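Using strings and bytes-like objects compatibly in code that runs on both Python 2 & 3

Time: 2016-08-31 14:18:22

Tags: python string python-2.7 python-3.x bytestring

I am trying to modify the code shown further below, which runs under Python 2.7.x, so that it also works unchanged under Python 3.x. However, I have run into the following problem in the first function, bin_to_float(), which I cannot solve, as the output below shows: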

float_to_bin(0.000000): '0'
Traceback (most recent call last):
  File "binary-to-a-float-number.py", line 36, in <module>
    float = bin_to_float(binary)
  File "binary-to-a-float-number.py", line 9, in bin_to_float
    return struct.unpack('>d', bf)[0]
TypeError: a bytes-like object is required, not 'str'

I tried to fix this by adding bf = bytes(bf) right before the call to struct.unpack(), but doing so produced its own TypeError:

TypeError: string argument without an encoding

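So my question is: is it possible to fix this and achieve my goal? If so, how? Preferably in a way that works in both versions of Python.

Here is the code that runs under Python 2: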

import struct

def bin_to_float(b):
    """ Convert binary string to a float. """
    bf = int_to_bytes(int(b, 2), 8)  # 8 bytes needed for IEEE 754 binary64
    return struct.unpack('>d', bf)[0]

def int_to_bytes(n, minlen=0):  # helper function
    """ Int/long to byte string. """
    nbits = n.bit_length() + (1 if n < 0 else 0)  # plus one for any sign bit
    nbytes = (nbits+7) // 8  # number of whole bytes
    bytes = []
    for _ in range(nbytes):
        bytes.append(chr(n & 0xff))
        n >>= 8
    if minlen > 0 and len(bytes) < minlen:  # zero pad?
        bytes.extend((minlen-len(bytes)) * '0')
    return ''.join(reversed(bytes))  # high bytes at beginning

# tests

def float_to_bin(f):
    """ Convert a float into a binary string. """
    ba = struct.pack('>d', f)
    ba = bytearray(ba)
    s = ''.join('{:08b}'.format(b) for b in ba)
    s = s.lstrip('0')  # strip leading zeros
    return s if s else '0'  # but leave at least one

for f in 0.0, 1.0, -14.0, 12.546, 3.141593:
    binary = float_to_bin(f)
    print('float_to_bin(%f): %r' % (f, binary))
    float = bin_to_float(binary)
    print('bin_to_float(%r): %f' % (binary, float))
    print('')
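
The second TypeError quoted above comes from Python 3, where bytes() called on a text string requires an explicit encoding. The following is only a sketch of that behaviour; the variable names text and raw are made up for illustration, and the latin1 codec is chosen here because it maps code points 0-255 to single bytes.

```python
import sys

text = u'\x3f\xf0'  # hypothetical sample: two characters with code points below 256

# Encoding explicitly behaves the same way on Python 2 and 3.
raw = text.encode('latin1')
assert bytearray(raw) == bytearray([0x3f, 0xf0])

if sys.version_info >= (3, 0):
    try:
        bytes(text)  # no encoding given
    except TypeError as exc:
        print(exc)  # string argument without an encoding
    # Passing the encoding to bytes() is equivalent to text.encode('latin1').
    assert bytes(text, 'latin1') == raw
```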

3 Answers:

Answer 0 (score: 3)

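To write portable code that handles bytes in both Python 2 and 3 while using libraries that expect different data types in each version, you need to declare every string explicitly with the appropriate literal prefix (or add from __future__ import unicode_literals to the top of every module that does this). This step ensures that the data types inside your code are correct.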

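Second, decide to support Python 3 going forward, with fallbacks specific to Python 2. This means overriding str with unicode, and finding the methods and functions that do not return the same type in both versions, then modifying or replacing them so that they return the correct type (that is, the Python 3 one). Note that bytes is also a built-in name, so don't use it as a variable name.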
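
The two steps above can be illustrated with a small sketch. The names data, fmt and values are invented for this example. Indexing a byte string is one case where the two versions return different types, and routing it through bytearray is one possible way to get the same type on both.

```python
data = b'\x3f\xf0'   # b prefix: byte string on both versions
fmt = u'>d'          # u prefix: text on both versions (Python 2, and 3.3 or later)

# Indexing a byte string yields a 1-character str on Python 2 but an int on
# Python 3; iterating over a bytearray yields ints on both.
values = list(bytearray(data))

assert values == [0x3f, 0xf0]
assert isinstance(data, bytes)
assert not isinstance(fmt, bytes)
```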

Putting these together, your code will look something like this:

import struct
import sys

if sys.version_info < (3, 0):
    str = unicode
    chr = unichr


def bin_to_float(b):
    """ Convert binary string to a float. """
    bf = int_to_bytes(int(b, 2), 8)  # 8 bytes needed for IEEE 754 binary64
    return struct.unpack(b'>d', bf)[0]

def int_to_bytes(n, minlen=0):  # helper function
    """ Int/long to byte string. """
    nbits = n.bit_length() + (1 if n < 0 else 0)  # plus one for any sign bit
    nbytes = (nbits+7) // 8  # number of whole bytes
    ba = bytearray(b'')
    for _ in range(nbytes):
        ba.append(n & 0xff)
        n >>= 8
    if minlen > 0 and len(ba) < minlen:  # zero pad?
        ba.extend((minlen-len(ba)) * b'\x00')  # pad with NUL bytes, not ASCII '0' (0x30)
    return u''.join(str(chr(b)) for b in reversed(ba)).encode('latin1')  # high bytes at beginning

# tests

def float_to_bin(f):
    """ Convert a float into a binary string. """
    ba = struct.pack(b'>d', f)
    ba = bytearray(ba)
    s = u''.join(u'{:08b}'.format(b) for b in ba)
    s = s.lstrip(u'0')  # strip leading zeros
    return (s if s else u'0').encode('latin1')  # but leave at least one

for f in 0.0, 1.0, -14.0, 12.546, 3.141593:
    binary = float_to_bin(f)
    print(u'float_to_bin(%f): %r' % (f, binary))
    float = bin_to_float(binary)
    print(u'bin_to_float(%r): %f' % (binary, float))
    print(u'')

I used the latin1 codec simply because that is how the byte mappings were originally defined, and it seems to work:

$ python2 foo.py 
float_to_bin(0.000000): '0'
bin_to_float('0'): 0.000000

float_to_bin(1.000000): '11111111110000000000000000000000000000000000000000000000000000'
bin_to_float('11111111110000000000000000000000000000000000000000000000000000'): 1.000000

float_to_bin(-14.000000): '1100000000101100000000000000000000000000000000000000000000000000'
bin_to_float('1100000000101100000000000000000000000000000000000000000000000000'): -14.000000

float_to_bin(12.546000): '100000000101001000101111000110101001111110111110011101101100100'
bin_to_float('100000000101001000101111000110101001111110111110011101101100100'): 12.546000

float_to_bin(3.141593): '100000000001001001000011111101110000010110000101011110101111111'
bin_to_float('100000000001001001000011111101110000010110000101011110101111111'): 3.141593

And again, this time under Python 3.5:

$ python3 foo.py 
float_to_bin(0.000000): b'0'
bin_to_float(b'0'): 0.000000

float_to_bin(1.000000): b'11111111110000000000000000000000000000000000000000000000000000'
bin_to_float(b'11111111110000000000000000000000000000000000000000000000000000'): 1.000000

float_to_bin(-14.000000): b'1100000000101100000000000000000000000000000000000000000000000000'
bin_to_float(b'1100000000101100000000000000000000000000000000000000000000000000'): -14.000000

float_to_bin(12.546000): b'100000000101001000101111000110101001111110111110011101101100100'
bin_to_float(b'100000000101001000101111000110101001111110111110011101101100100'): 12.546000

float_to_bin(3.141593): b'100000000001001001000011111101110000010110000101011110101111111'
bin_to_float(b'100000000001001001000011111101110000010110000101011110101111111'): 3.141593

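It is a lot more work, but under Python 3 you can see more clearly that the values are proper bytes. I also changed your bytes = [] to a bytearray to express more clearly what you were trying to do.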
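
The latin1 choice mentioned above can be checked directly: that codec maps each code point from 0 to 255 to the byte with the same value, so a round trip loses nothing. This is only a quick sketch, and the names all_bytes and as_text are made up for it.

```python
# Every possible byte value, built in a way that works on Python 2 and 3.
all_bytes = bytes(bytearray(range(256)))

as_text = all_bytes.decode('latin1')

# Each byte becomes the character with the same code point, and back again.
assert [ord(c) for c in as_text] == list(range(256))
assert as_text.encode('latin1') == all_bytes
```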

Answer 1 (score: 1)

I took a different approach from @metatoaster's answer. I just modified int_to_bytes to use and return a bytearray:

def int_to_bytes(n, minlen=0):  # helper function
    """ Int/long to byte string. """
    nbits = n.bit_length() + (1 if n < 0 else 0)  # plus one for any sign bit
    nbytes = (nbits+7) // 8  # number of whole bytes
    b = bytearray()
    for _ in range(nbytes):
        b.append(n & 0xff)
        n >>= 8
    if minlen > 0 and len(b) < minlen:  # zero pad?
        b.extend([0] * (minlen-len(b)))
    return bytearray(reversed(b))  # high bytes at beginning

This seems to work without any other modifications under both Python 2.7.11 and Python 3.5.1.
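
One way to check this claim is a round trip over the sample values from the question. The sketch below is self-contained, so it repeats a condensed variant of int_to_bytes (non-negative input only) and wraps the result in bytes() before unpacking, which is an extra precaution rather than something the original code does.

```python
import struct

def int_to_bytes(n, minlen=0):
    """Non-negative int to big-endian bytearray, NUL-padded to minlen bytes."""
    b = bytearray()
    while n > 0:
        b.append(n & 0xff)
        n >>= 8
    b.extend([0] * (minlen - len(b)))  # no-op when already long enough
    return bytearray(reversed(b))

def bin_to_float(bits):
    """Binary digit string to float (IEEE 754 binary64, big-endian)."""
    raw = bytes(int_to_bytes(int(bits, 2), 8))
    return struct.unpack('>d', raw)[0]

def float_to_bin(f):
    """Float to binary digit string with leading zeros stripped."""
    ba = bytearray(struct.pack('>d', f))
    s = ''.join('{:08b}'.format(b) for b in ba)
    return s.lstrip('0') or '0'

# The bit pattern is preserved exactly, so the comparison can be exact.
for f in (0.0, 1.0, -14.0, 12.546, 3.141593):
    assert bin_to_float(float_to_bin(f)) == f
```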

Note that I pad with 0 rather than '0'. I haven't done much testing, but surely that is what you meant?
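
The difference matters because the character '0' is byte value 0x30, not 0x00. A minimal sketch follows, with invented variable names. It also suggests why the original padding went unnoticed: the resulting value is tiny enough to print as 0.000000 under %f.

```python
import struct

nul_padded = bytearray([0] * 8)      # eight 0x00 bytes
ascii_padded = bytearray(b'0' * 8)   # eight 0x30 bytes, the character '0'

zero = struct.unpack('>d', bytes(nul_padded))[0]
not_zero = struct.unpack('>d', bytes(ascii_padded))[0]

assert zero == 0.0
assert not_zero != 0.0           # a very small positive number, not zero
print('%f' % not_zero)           # 0.000000
```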

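Answer 2 (score: 1)

In Python 3, integers have a to_bytes() method that performs the conversion in a single call. However, since you asked for a solution that works unmodified on both Python 2 and 3, here is an alternative approach.

If you take a detour via the hexadecimal representation, the int_to_bytes() function becomes very simple: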

import codecs

def int_to_bytes(n, minlen=0):
    hex_str = format(n, "0{}x".format(2 * minlen))
    return codecs.decode(hex_str, "hex")

You may need some special-case handling for when the hex string ends up with an odd number of characters.
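
One possible way to handle the odd-length case is to pad the hex string to an even width before decoding. This is a sketch rather than a tested replacement: it assumes n is non-negative, and it uses zfill() with a computed width instead of the format specification used above.

```python
import codecs

def int_to_bytes(n, minlen=0):
    """Non-negative int to big-endian byte string, at least minlen bytes long."""
    hex_str = format(n, "x")
    # Round the digit count up to an even number, and to at least 2 * minlen.
    width = max(2 * minlen, len(hex_str) + len(hex_str) % 2)
    return codecs.decode(hex_str.zfill(width), "hex")
```

For example, 0xfff has three hex digits, so it is padded to '0fff' and decodes to two bytes.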

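Note that I am not sure this works in all versions of Python 3. I recall that pseudo-encodings were not supported in some 3.x release, but I don't remember the details. I tested the code with Python 3.5.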
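
If the codec lookup is a concern, one alternative is to avoid it altogether. The sketch below is an assumption-laden variant, not part of the tested answer: it uses int.to_bytes() where that method exists and falls back to binascii.unhexlify() otherwise. It assumes n is non-negative and always returns at least one byte.

```python
import binascii

def int_to_bytes(n, minlen=0):
    """Non-negative int to big-endian bytes, at least max(minlen, 1) bytes long."""
    nbytes = max(minlen, (n.bit_length() + 7) // 8, 1)
    if hasattr(n, "to_bytes"):                      # available on Python 3
        return n.to_bytes(nbytes, "big")
    hex_str = "{:x}".format(n).zfill(2 * nbytes)    # fallback, e.g. Python 2.7
    return binascii.unhexlify(hex_str.encode("ascii"))
```

Because the width is always 2 * nbytes, the hex string in the fallback branch has an even number of digits by construction.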