Question

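This code runs under Python 2.7 and fails under 3.5. I want to port it to 3.5. I am stuck at the point where a for loop changes the data type. I am an experienced programmer but fairly new to Python, so this may be obvious, and my google-fu has failed to turn up this exact example or a solution. So here we go: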

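Below is a snippet from this code, which works under 2.7: http://trac.nccoos.org/dataproc/browser/DPWP/trunk/DPWP/ADCP_splitter/pd0.py pd0.py opens a binary input stream, looks for the bytes that identify each record type, and splits the data into two separate files containing the corresponding data. All of the files are binary.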

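In the code block below, header, length, and ensemble are all bytes objects. In Python 3.5, something happens when the for loop iterates over them: it produces ints, which then makes struct.unpack fail. You can see in the comments where I experimented with casting and indexing, all to no avail. I would like a detailed understanding of what is happening here so that I can write more 3.5 binary operations correctly.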

The line that fails is value = struct.unpack('B', byte)[0]

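Where I have looked for a solution:

Reading how bytes is defined (you can iterate over it, but how to do that eludes me)
Lots of discussion about str -> bytes and vice versa, none of which addresses this problem
Reading about how unpack works (unpack is not meant to unpack an int, apparently)
Guides on converting from Python 2.7 to 3.x
Here on Stack Overflow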

Thanks in advance. Here is the code:

def __computeChecksum(header, length, ensemble):
    """Compute a checksum from header, length, and ensemble"""
    # these print as a byte (b'\x7f\x7f' or b'\x7fy') at this point
    print(header)  # header is a bytes object
    cs = 0   
    # so, when the first byte of header is assigned to byte, it gets cast to int.  Why, and how to prevent this?
    for byte in header:
        print(byte) # this prints as an integer at this point, 127 = 0x7F because a bytes object is an "immutable sequence of integers"
        print(type(byte)) # here byte is an int - we need it to be a bytes object for unpack to work
        value = struct.unpack('B', byte)[0]  # this is the line that gets TypeError: a bytes-like object is required, not 'int'
        # this does not work either - from examples online I thought that referencing the first in the array was the problem
        #value = struct.unpack('B', byte)  # this is the line that gets TypeError: a bytes-like object is required, not 'int'
        # this does not work, the error is unpack requires a bytes object of length 1, so the casting happened
        #value = struct.unpack('B', bytes(byte))[0] 
        # and this got the error a bytes-like object is required, not 'int', so the [0] reference generates an int
        # value = struct.unpack('B', bytes(byte)[0])[0] 
        cs += value
    for byte in length:
        value = struct.unpack('B', byte)[0]
        cs += value
    for byte in ensemble:
        value = struct.unpack('B', byte)[0]
        cs += value
    return cs & 0xffff

# convenience function reused for header, length, and checksum
def __nextLittleEndianUnsignedShort(file):
    """Get next little endian unsigned short from file"""
    raw = file.read(2)
    """for python 3.5, struct.unpack('<H', raw)[0] needs to return a
       byte, not an int
       Note that it's not a problem here, but in the next cell, when a for loop is involved, we get an error
    """
    return (raw, struct.unpack('<H', raw)[0])

The code in the main program that calls the functions above

while (header == wavesId) or (header == currentsId):
    print('recnum= ',recnum)
    # get ensemble length
    rawLength, length = __nextLittleEndianUnsignedShort(rawFile)
    # read up to the checksum
    rawEnsemble = rawFile.read(length-4)
    # get checksum
    rawChecksum, checksum = __nextLittleEndianUnsignedShort(rawFile)

    computedChecksum = __computeChecksum(rawHeader, rawLength, rawEnsemble)

    if checksum != computedChecksum:
        raise IOError('Checksum error')

Finally, the full text of the error

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-6-5e60bd9b9a54> in <module>()
     13     rawChecksum, checksum = __nextLittleEndianUnsignedShort(rawFile)
     14 
---> 15     computedChecksum = __computeChecksum(rawHeader, rawLength, rawEnsemble)
     16 
     17     if checksum != computedChecksum:

<ipython-input-3-414811fc52e4> in __computeChecksum(header, length, ensemble)
     16        print(byte) # this prints as an integer at this point, 127 = 0x7F because a bytes object is a "mutable sequence of integers"
     17        print(type(byte)) # here byte is an int - weneed it to be a bytes object for unpack to work
---> 18        value = struct.unpack('B', byte)[0]  # this is the line that gets TypeError: a bytes-like object is required, not 'int'
     19        # this does not work either - from examples online I thought that referencing the first in the array was the problem
     20        #value = struct.unpack('B', byte)  # this is the line that gets TypeError: a bytes-like object is required, not 'int'

TypeError: a bytes-like object is required, not 'int'

The full Python notebook is here: https://gist.github.com/mmartini-usgs/4795da39adc9905f70fd8c27a1bba3da

Answer 1

The most elegant solution turned out to be simply:

ensemble = infile.read(ensemblelength)

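def __computeChecksum(ensemble):
    """Sum all bytes of ensemble except the last two, keeping the low 16 bits"""
    cs = 0
    # indexing a bytes object in Python 3 already returns an int, so no struct.unpack is needed
    for byte in range(len(ensemble)-2):
        cs += ensemble[byte]
    return cs & 0xffff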
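
Since iterating over a bytes object in Python 3 already yields ints, the same idea can probably be applied to the three-argument version from the question without any loop at all. The following is only a sketch: the name compute_checksum and the sample values are made up for illustration, and it assumes header, length, and ensemble are bytes objects as described in the question.

```python
def compute_checksum(header, length, ensemble):
    """Sum every byte of the three bytes objects and keep the low 16 bits.

    Hypothetical rewrite of the question's __computeChecksum; sum() works
    because iterating over bytes in Python 3 yields ints.
    """
    return (sum(header) + sum(length) + sum(ensemble)) & 0xFFFF


# Made-up sample values, only to show the call
header = b'\x7f\x7f'
length = b'\x06\x00'
ensemble = b'\x01\x02'
print(compute_checksum(header, length, ensemble))
```

The built-in sum() accepts any iterable of numbers, so it replaces both the loop and the struct.unpack call.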

Answer 2

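It is tricky to answer without knowing what header is and how the data is read (although, actually, that is in the comments). Opening the file with rb (read binary) gives you bytes objects, but that does not prevent this: in Python 3, iterating over a bytes object yields ints.

Here is a better explanation of the problem: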

iterate over individual bytes in python3

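I would use an if-clause to get the int, but you can convert back to bytes as in that answer. Also, take a look at numpy.fromfile. IMO it is easier to use.

PS: That is a lot of detail! If you follow SSCCE, you will probably get more useful answers. And you can still post a link to the full notebook as usual ;-)

I would rewrite your question using your comments, like this:

When iterating over bytes in Python 3.x, I get ints instead of bytes. Is it possible to get each element as bytes?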
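
As a rough illustration of converting back to bytes, the sketch below uses only the standard library. The sample data is made up. It also shows why the bytes(byte) attempt in the question failed: bytes(n) with an int n builds n zero bytes rather than a single byte of value n.

```python
import struct

data = b'\x7f\x79'  # made-up sample, not taken from the real file

# bytes(n) with an int n creates n zero bytes, not one byte of value n
zeros = bytes(3)

# bytes([n]) wraps a single int back into a length-1 bytes object
one = bytes([data[0]])

# Unpack one byte at a time, as the original 2.7 code did
values = [struct.unpack('B', bytes([b]))[0] for b in data]

# Or unpack the whole buffer at once with a repeat count in the format
all_at_once = struct.unpack('{}B'.format(len(data)), data)

print(zeros, one, values, all_at_once)
```

Both unpack variants end up with the same ints that plain iteration gives, so the round trip through struct is only worth it if the code must keep calling unpack.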

In [0]: [byte for byte in b'\x7f\x7f']
Out[0]: [127, 127]

Using a for loop
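
One possible way to get bytes rather than ints from a for loop is to slice instead of index, since a slice of a bytes object is itself a bytes object. This is a minimal sketch; iter_single_bytes is a hypothetical helper name, not something from the original code.

```python
def iter_single_bytes(data):
    """Yield each element of data as a length-1 bytes object.

    Hypothetical helper: data[i] would be an int, but data[i:i + 1]
    is a bytes object of length 1.
    """
    for i in range(len(data)):
        yield data[i:i + 1]


for byte in iter_single_bytes(b'\x7f\x7f'):
    print(byte, type(byte))
```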
