Question

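I am porting a program from Python 2 to Python 3, and I am having difficulty with the % (interpolation) operator when the values are bytes.

Suppose we need to port this expression from Python 2: '%s: %s\r\n' % (name, value).

In the ported version of the program, name and value are of type bytes, and the result should also be bytes. In Python 3, binary interpolation is only available from Python 3.5 onward (PEP 461). So, I am not sure I am right, but there seem to be only two ways to deal with this: concatenation, or string encoding/decoding where appropriate: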

>>> name = b'Host'
>>> value = b'example.com'
>>> # Decode bytes and encode resulting string.
>>> ('%s: %s\r\n' % (name.decode('ascii'), value.decode('ascii'))).encode('ascii')
b'Host: example.com\r\n'
>>> # ... or just use concatenation.
>>> name + b': ' + value + b'\r\n'
b'Host: example.com\r\n'

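To me, both solutions look a bit ugly. Is there a convention or recommendation on how to port string formatting when the values are bytes?

Note that the 2to3 tool should not be used; the program should work under both Python 2 and 3.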

Answer 1

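For CPython, I created the bttf library, to which I will be adding some porting features; currently it supports monkeypatching the 3.5 bytes formatting codes into Python 3.3 and 3.4.

Thus, where before you had: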

>>> b'I am bytes format: %s, %08d' % (b'asdf', 42)
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for %: 'bytes' and 'tuple'

With bttf:

>>> from bttf import install
>>> install('bytes_mod')
>>> b'I am bytes format: %s, %08d' % (b'asdf', 42)
b'I am bytes format: asdf, 00000042'

Unlike __future__ imports, the patching is interpreter-wide.
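If an interpreter-wide patch is not wanted, a small version-guarded helper is one possible alternative. The sketch below is not part of bttf: format_header is a hypothetical name, and it assumes name and value are already bytes (str on Python 2). It uses % where the interpreter supports it for bytes and falls back to concatenation on Python 3.0 to 3.4.

```python
import sys

# True on Python 2 (where str is bytes) and on Python 3.5+ (PEP 461).
BYTES_MOD_SUPPORTED = sys.version_info[0] == 2 or sys.version_info >= (3, 5)


def format_header(name, value):
    """Return b'<name>: <value>\\r\\n' for bytes inputs (hypothetical helper)."""
    if BYTES_MOD_SUPPORTED:
        # The % operator on bytes is only evaluated at runtime, so this
        # line is harmless on 3.3/3.4 as long as it is never reached.
        return b'%s: %s\r\n' % (name, value)
    # Fallback for Python 3.0 to 3.4: plain bytes concatenation.
    return name + b': ' + value + b'\r\n'
```

Calling format_header(b'Host', b'example.com') should give b'Host: example.com\r\n' on either branch.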

Answer 2

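The decode-format-encode solution may look ugly in this particular case, but it is definitely idiomatic.

The idea is that you manipulate only Unicode strings internally, and decode/encode when receiving/sending data. This approach is called the "Unicode sandwich" in Ned Batchelder's "Pragmatic Unicode".

Also, depending on the context, you might simply want to change the fact that name and value are bytes objects.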
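A minimal sketch of that sandwich for the header example from the question, meant to run on both Python 2 and Python 3.3+. The function name build_header and the default 'ascii' encoding are assumptions made for illustration; they do not come from the talk or from the question.

```python
def build_header(raw_name, raw_value, encoding='ascii'):
    """Decode on the way in, format as text, encode on the way out."""
    # Bottom slice of bread: bytes become text at the input boundary.
    name = raw_name.decode(encoding)
    value = raw_value.decode(encoding)

    # Filling: all formatting happens on Unicode text.
    # The u'' prefix is valid on Python 2 and on Python 3.3+.
    line = u'%s: %s\r\n' % (name, value)

    # Top slice of bread: text becomes bytes at the output boundary.
    return line.encode(encoding)
```

In a larger program the decode and encode steps would usually sit at the I/O layer rather than inside each formatting helper, so the code in between only ever sees text.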

Porting to Python 3: string/bytes formatting
