Using a for loop

Date: 2017-01-12 17:31:26

Tags: python-3.x for-loop byte

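This code runs in Python 2.7 and fails in 3.5. I want to convert it to 3.5. I am stuck at a point where using a for loop changes the data type. I am an experienced programmer but fairly new to Python, so this may be obvious, and my google-fu has failed to turn up this exact example or a solution. So here we go:

Below is a snippet from this code, which works in 2.7: http://trac.nccoos.org/dataproc/browser/DPWP/trunk/DPWP/ADCP_splitter/pd0.py pd0.py opens a binary input stream, looks for the bytes that identify the record type, and splits the data into two separate files containing the corresponding data, all of them binary.

In the code block below, header, length, and ensemble are all bytes objects. In Python 3.5, something happens when the for loop iterates: it produces an int, which then causes struct.unpack to fail. You can see in the comments where I have played with casting and indexing, all to no avail. I would like a detailed understanding of what is happening here so that I can write more binary operations correctly in 3.5.

The line that fails is value = struct.unpack('B', byte)[0]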
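The behavior described above can be reproduced in a few lines. This is a minimal sketch that is not part of the original question; the sample value for header is taken from the comments in the code below, and the other variable names are illustrative.

```python
import struct

header = b'\x7f\x7f'

# Iterating over (or indexing) a bytes object yields ints in Python 3.
items = [b for b in header]
print(items)            # [127, 127]
print(type(header[0]))  # <class 'int'>

# Slicing keeps the bytes type, so struct.unpack accepts the result.
first = header[0:1]
print(first)                         # b'\x7f'
print(struct.unpack('B', first)[0])  # 127

# bytes(n) with an int n creates n zero bytes; it does not wrap the value.
# bytes([n]) builds a length-1 bytes object holding that value.
print(len(bytes(127)))  # 127
print(bytes([127]))     # b'\x7f'
```

This would also explain why the bytes(byte) attempt in the code reports a length error: bytes(127) is 127 bytes long, not 1.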

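Places I have looked for a solution:

  • Reading how bytes is defined (you can iterate over it, but how eludes me)
  • Lots of discussion about str -> bytes and vice versa, which does not address this problem
  • Reading about how unpack works (evidently unpack is not meant to unpack an int)
  • Converting from 2.7 to 3.x Python
  • Here on Stack Overflow

Thanks in advance. Here is the code: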

def __computeChecksum(header, length, ensemble):
    """Compute a checksum from header, length, and ensemble"""
    # these print as a byte (b'\x7f\x7f' or b'\x7fy') at this point
    print(header)  # header is a bytes object
    cs = 0   
    # so, when the first byte of header is assigned to byte, it gets cast to int.  Why, and how to prevent this?
    for byte in header:
        print(byte) # this prints as an integer at this point, 127 = 0x7F because a bytes object is a "mutable sequence of integers"
        print(type(byte)) # here byte is an int - we need it to be a bytes object for unpack to work
        value = struct.unpack('B', byte)[0]  # this is the line that gets TypeError: a bytes-like object is required, not 'int'
        # this does not work either - from examples online I thought that referencing the first in the array was the problem
        #value = struct.unpack('B', byte)  # this is the line that gets TypeError: a bytes-like object is required, not 'int'
        # this does not work, the error is unpack requires a bytes object of length 1, so the casting happened
        #value = struct.unpack('B', bytes(byte))[0] 
        # and this got the error a bytes-like object is required, not 'int', so the [0] reference generates an int
        # value = struct.unpack('B', bytes(byte)[0])[0] 
        cs += value
    for byte in length:
        value = struct.unpack('B', byte)[0]
        cs += value
    for byte in ensemble:
        value = struct.unpack('B', byte)[0]
        cs += value
    return cs & 0xffff

# convenience function reused for header, length, and checksum
def __nextLittleEndianUnsignedShort(file):
    """Get next little endian unsigned short from file"""
    raw = file.read(2)
    """for python 3.5, struct.unpack('<H', raw)[0] needs to return a
       byte, not an int
       Note that it's not a problem here, but in the next cell, when a for loop is involved, we get an error
    """
    return (raw, struct.unpack('<H', raw)[0])
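As the note in the helper above says, the two-byte read is not where the problem lies: file.read(2) on a binary stream returns a bytes object, which struct.unpack accepts directly. The sketch below illustrates this with an in-memory stream; the function name without leading underscores and the sample bytes are assumptions made for illustration only.

```python
import io
import struct

def next_little_endian_unsigned_short(file):
    """Return the raw 2 bytes and their value as a little-endian unsigned short."""
    raw = file.read(2)  # a bytes object of length 2 for a binary stream
    return raw, struct.unpack('<H', raw)[0]

stream = io.BytesIO(b'\x7f\x79\x10\x00')  # made-up sample data
raw, value = next_little_endian_unsigned_short(stream)
print(raw, value)  # b'\x7fy' 31103
```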

Code in the main program that calls the functions above:

while (header == wavesId) or (header == currentsId):
    print('recnum= ',recnum)
    # get ensemble length
    rawLength, length = __nextLittleEndianUnsignedShort(rawFile)
    # read up to the checksum
    rawEnsemble = rawFile.read(length-4)
    # get checksum
    rawChecksum, checksum = __nextLittleEndianUnsignedShort(rawFile)

    computedChecksum = __computeChecksum(rawHeader, rawLength, rawEnsemble)

    if checksum != computedChecksum:
        raise IOError('Checksum error')

Finally, the full text of the error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-6-5e60bd9b9a54> in <module>()
     13     rawChecksum, checksum = __nextLittleEndianUnsignedShort(rawFile)
     14 
---> 15     computedChecksum = __computeChecksum(rawHeader, rawLength, rawEnsemble)
     16 
     17     if checksum != computedChecksum:

<ipython-input-3-414811fc52e4> in __computeChecksum(header, length, ensemble)
     16        print(byte) # this prints as an integer at this point, 127 = 0x7F because a bytes object is a "mutable sequence of integers"
     17        print(type(byte)) # here byte is an int - we need it to be a bytes object for unpack to work
---> 18        value = struct.unpack('B', byte)[0]  # this is the line that gets TypeError: a bytes-like object is required, not 'int'
     19        # this does not work either - from examples online I thought that referencing the first in the array was the problem
     20        #value = struct.unpack('B', byte)  # this is the line that gets TypeError: a bytes-like object is required, not 'int'

TypeError: a bytes-like object is required, not 'int'

The full Python notebook is here: https://gist.github.com/mmartini-usgs/4795da39adc9905f70fd8c27a1bba3da

2 answers:

Answer 0: (score: 1)

The most elegant solution is simply:

ensemble = infile.read(ensemblelength)

def __computeChecksum(ensemble):
    cs = 0    
    for byte in range(len(ensemble)-2):
        cs += ensemble[byte]
    return cs & 0xffff
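A self-contained variant of the same idea, offered as a sketch: since a Python 3 bytes object already iterates as ints, the built-in sum can replace the explicit loop. The sample ensemble is made-up data, and the layout (header, length, payload, then a little-endian two-byte checksum at the end) is an assumption based on the question's code.

```python
import struct

def compute_checksum(ensemble):
    """Sum every byte except the trailing 2-byte checksum, modulo 65536."""
    return sum(ensemble[:-2]) & 0xffff

# Hypothetical ensemble: header, length field, payload, stored checksum.
payload = b'\x01\x02'
body = b'\x7f\x7f' + struct.pack('<H', 4 + len(payload)) + payload
ensemble = body + struct.pack('<H', sum(body) & 0xffff)

stored = struct.unpack('<H', ensemble[-2:])[0]
print(compute_checksum(ensemble) == stored)  # True
```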

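Answer 1: (score: 0)

It is hard to answer without knowing what header is and how you read the data. In theory, if you read it with rb (read binary), this should not happen. (It is actually in the comments.)

Here is a better explanation of the problem:

iterate over individual bytes in python3

I would use an if-clause to get the int, but you can convert back to bytes as in that answer. Also, take a look at numpy.fromfile. It is easier to use, IMO.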
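Converting each item back to bytes, as mentioned above, might look like the following sketch. The sample data and variable names are illustrative, and this is only one way to do it, not necessarily the approach taken in the linked answer.

```python
import struct

data = b'\x7f\x79'

# Option 1: slice one byte at a time so each item stays a bytes object.
chunks = [data[i:i + 1] for i in range(len(data))]
print(chunks)  # [b'\x7f', b'y']

# Option 2: rebuild a length-1 bytes object from each int.
rebuilt = [bytes([b]) for b in data]

# Either form can be passed to struct.unpack('B', ...).
values = [struct.unpack('B', c)[0] for c in chunks]
print(values)  # [127, 121]
```

For a checksum the conversion is unnecessary, since the ints produced by iteration are already the values that struct.unpack('B', ...) would return.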

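PS: That is a lot of detail! If you follow SSCCE, you will probably get more meaningful answers. And you can still post a link to the full notebook as usual ;-)

I would rewrite your question, using your comments, as something like:

When iterating over bytes in Python 3.x, I get ints instead of bytes. Is it possible to get each item as bytes instead?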

In [0]: [byte for byte in b'\x7f\x7f']
Out[0]: [127, 127]