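This code runs in Python 2.7 and fails in 3.5, and I want to convert it to 3.5. I am stuck at the point where iterating with a for loop changes the data type. I am an experienced programmer but fairly new to Python, so this may be obvious, and my Google-fu has failed to turn up this exact example or a solution. So here we go:
The snippets below come from this code, which works in 2.7: http://trac.nccoos.org/dataproc/browser/DPWP/trunk/DPWP/ADCP_splitter/pd0.py pd0.py opens a binary input stream, looks for the identifying bytes that mark the record type, and splits the data into two separate files holding the corresponding records. All files are binary.
In the code block below, header, length, and ensemble are all bytes objects. In Python 3.5, iterating over them in the for loop yields ints, which then makes struct.unpack fail. You can see in the comments where I experimented with casting and indexing, all to no avail. I would like to understand in detail what is happening here so that I can write further binary operations correctly in 3.5.
The line that fails is value = struct.unpack('B', byte)[0]
Places I have looked for a solution:
Thanks in advance. Here is the code: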
import struct

def __computeChecksum(header, length, ensemble):
    """Compute a checksum from header, length, and ensemble"""
    # these print as bytes (b'\x7f\x7f' or b'\x7fy') at this point
    print(header)  # header is a bytes object
    cs = 0
    # so, when the first byte of header is assigned to byte, it gets cast to int. Why, and how to prevent this?
    for byte in header:
        print(byte)  # this prints as an integer at this point, 127 = 0x7F, because a bytes object is an "immutable sequence of integers"
        print(type(byte))  # here byte is an int - we need it to be a bytes object for unpack to work
        value = struct.unpack('B', byte)[0]  # this is the line that gets TypeError: a bytes-like object is required, not 'int'
        # this does not work either - from examples online I thought that referencing the first in the array was the problem
        #value = struct.unpack('B', byte) # this is the line that gets TypeError: a bytes-like object is required, not 'int'
        # this does not work, the error is unpack requires a bytes object of length 1, so the casting happened
        #value = struct.unpack('B', bytes(byte))[0]
        # and this got the error a bytes-like object is required, not 'int', so the [0] reference generates an int
        # value = struct.unpack('B', bytes(byte)[0])[0]
        cs += value
    for byte in length:
        value = struct.unpack('B', byte)[0]
        cs += value
    for byte in ensemble:
        value = struct.unpack('B', byte)[0]
        cs += value
    return cs & 0xffff

# convenience function reused for header, length, and checksum
def __nextLittleEndianUnsignedShort(file):
    """Get next little endian unsigned short from file"""
    raw = file.read(2)
    """for python 3.5, struct.unpack('<H', raw)[0] needs to return a
    byte, not an int
    Note that it's not a problem here, but in the next cell, when a for loop is involved, we get an error
    """
    return (raw, struct.unpack('<H', raw)[0])
The code in the main program that calls the functions above:
while (header == wavesId) or (header == currentsId):
    print('recnum= ', recnum)
    # get ensemble length
    rawLength, length = __nextLittleEndianUnsignedShort(rawFile)
    # read up to the checksum
    rawEnsemble = rawFile.read(length-4)
    # get checksum
    rawChecksum, checksum = __nextLittleEndianUnsignedShort(rawFile)
    computedChecksum = __computeChecksum(rawHeader, rawLength, rawEnsemble)
    if checksum != computedChecksum:
        raise IOError('Checksum error')
Finally, the full text of the error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-6-5e60bd9b9a54> in <module>()
13 rawChecksum, checksum = __nextLittleEndianUnsignedShort(rawFile)
14
---> 15 computedChecksum = __computeChecksum(rawHeader, rawLength, rawEnsemble)
16
17 if checksum != computedChecksum:
<ipython-input-3-414811fc52e4> in __computeChecksum(header, length, ensemble)
16 print(byte) # this prints as an integer at this point, 127 = 0x7F because a bytes object is a "mutable sequence of integers"
17 print(type(byte)) # here byte is an int - weneed it to be a bytes object for unpack to work
---> 18 value = struct.unpack('B', byte)[0] # this is the line that gets TypeError: a bytes-like object is required, not 'int'
19 # this does not work either - from examples online I thought that referencing the first in the array was the problem
20 #value = struct.unpack('B', byte) # this is the line that gets TypeError: a bytes-like object is required, not 'int'
TypeError: a bytes-like object is required, not 'int'
The complete Python notebook is here: https://gist.github.com/mmartini-usgs/4795da39adc9905f70fd8c27a1bba3da
Answer 0 (score: 1)
The most elegant solution turned out to be simply:
ensemble = infile.read(ensemblelength)

def __computeChecksum(ensemble):
    cs = 0
    for byte in range(len(ensemble)-2):
        cs += ensemble[byte]
    return cs & 0xffff
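Because indexing or iterating a bytes object in Python 3 already yields ints, the three-argument function from the question can probably be reduced the same way, without struct.unpack in the loop. The following is a minimal sketch under that assumption; the name compute_checksum and the sample values are hypothetical and are not taken from pd0.py.

```python
import struct


def compute_checksum(header, length, ensemble):
    """Sum every byte of the three fields and keep the low 16 bits."""
    # sum() over a bytes object adds up its integer byte values.
    return (sum(header) + sum(length) + sum(ensemble)) & 0xFFFF


# Hypothetical sample record, for illustration only.
header = b"\x7f\x7f"
length = struct.pack("<H", 6)  # the two raw little-endian length bytes
ensemble = b"\x01\x02"

checksum = compute_checksum(header, length, ensemble)
print(checksum)
```

Note that this sketch only runs on Python 3; on Python 2.7, iterating a str yields one-character strings, so sum() would raise a TypeError there.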
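Answer 1 (score: 0)
It is hard to answer without knowing what header is and how the data is read. In theory, if you read it with rb (read binary), that should not happen. (Actually, it is in the comments.)
Here is a better explanation of the problem:
iterate over individual bytes in python3
I would use an if-clause to get the int, but you can convert back to bytes as in that answer. Also, check out numpy.fromfile. It is easier to work with, IMO.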
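For the "convert back to bytes" route, here is a small sketch of the options I am aware of, using a made-up two-byte value. It also suggests why the bytes(byte) attempt in the question produced a length error: bytes called with a bare int builds that many zero bytes rather than a single byte with that value.

```python
import struct

data = b"\x7f\x7f"  # hypothetical sample, same shape as the header in the question

# Indexing (like iterating) a bytes object yields an int in Python 3.
first_int = data[0]

# A length-1 slice stays a bytes object, so struct.unpack accepts it.
first_slice = data[0:1]
value_from_slice = struct.unpack("B", first_slice)[0]

# Wrapping the int in a list rebuilds a one-byte bytes object.
rebuilt = bytes([first_int])

# bytes(n) with a bare int creates n zero bytes, not one byte of value n.
zeros = bytes(3)

print(first_int, first_slice, value_from_slice, rebuilt, zeros)
```

In practice the int from indexing is usually what the checksum needs, so the round trip through struct.unpack is only worth it when some other API insists on a bytes object.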
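I would rewrite your question using your comment, something like:
When iterating over bytes on Python 3.x, I get ints instead of bytes. Is it possible to get bytes for all of them instead?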
In [0]: [byte for byte in b'\x7f\x7f']
Out[0]: [127, 127]
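If struct is still preferred for the conversion, it can unpack the whole buffer in one call instead of one byte per loop iteration. This is only a sketch with an invented three-byte sample; struct.iter_unpack is available from Python 3.4 onward.

```python
import struct

data = b"\x7f\x7f\x01"  # hypothetical sample bytes

# A repeat count in the format string unpacks every byte as an unsigned char.
values = struct.unpack("%dB" % len(data), data)

# iter_unpack walks the buffer one item at a time; each item is a 1-tuple.
total = sum(item[0] for item in struct.iter_unpack("B", data))

print(values, total)
```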