How to read information from a binary file

Time: 2011-07-20 11:55:03

Tags: python

I have a binary file and a specification:

after 'abst' (0x61627374):
var1  Unsigned 8-bit integer
var2 Unsigned 24-bit integer
var3 Sequence of Unicode 8-bit characters (UTF-8), terminated with 0x00 

How can I read var1, var2 and var3 from the file?

2 answers:

Answer 0: (score: 1)

Quick and dirty, and untested:

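# assumption: the file is small enough to fit into RAM,
# and 'abst' does not occur anywhere else in the data
# data holds the whole file as bytes, e.g. data = open(path, 'rb').read()
for hunk in data.split(b'abst')[1:]:  # skip the first hunk: it is whatever precedes the first 'abst' occurrence
    var1 = hunk[0]  # indexing bytes yields an int in Python 3, so ord() is not needed
    # little-endian byte order is assumed here; the spec does not state it
    var2 = hunk[1] + hunk[2]*256 + hunk[3]*256*256
    var3 = hunk[4:].split(b'\x00')[0].decode('utf-8')  # text up to the 0x00 terminator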
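
A stdlib-only variant of the same idea, as a minimal sketch rather than a tested solution. It looks for the marker with bytes.find instead of splitting, and it takes the byte order as a parameter because the specification quoted in the question does not say whether the 24-bit integer is big- or little-endian. The name parse_abst, its parameters and the sample bytes are made up for illustration.

```python
def parse_abst(data, byteorder="big"):
    """Return (var1, var2, var3) found after the first b'abst' marker.

    byteorder is an assumption: the quoted spec does not state it.
    """
    start = data.find(b"abst")
    if start == -1:
        raise ValueError("marker 'abst' not found")
    pos = start + 4                          # skip the 4-byte marker
    if len(data) < pos + 4:
        raise ValueError("data ends before var1/var2")
    var1 = data[pos]                         # unsigned 8-bit integer
    var2 = int.from_bytes(data[pos + 1:pos + 4], byteorder)  # unsigned 24-bit integer
    end = data.find(b"\x00", pos + 4)        # position of the 0x00 terminator
    if end == -1:
        raise ValueError("var3 is not terminated by 0x00")
    var3 = data[pos + 4:end].decode("utf-8")
    return var1, var2, var3


# Hypothetical sample: junk, marker, var1=1, var2 bytes 00 00 02, text, terminator
sample = b"junk" + b"abst" + bytes([1]) + bytes([0, 0, 2]) + "caf\u00e9".encode("utf-8") + b"\x00rest"
print(parse_abst(sample))
```

Unlike the split approach, this still works when 'abst' happens to occur again later in the data, since only the first match is used.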

Answer 1: (score: 0)

The bitstring module could be helpful here, since you have unusual bit lengths, and it is more readable than unpacking the values by hand:

import bitstring
bitstring.bytealigned = True
s = bitstring.ConstBitStream(your_file)
if s.find('0x61627374'):  # seeks to your start code
    start_code, var1, var2 = s.readlist('bytes:4, uint:8, uint:24')
    p1 = s.pos
    found = s.find('0x00', start=p1)  # find the next '\x00'; the result is a tuple holding the bit position
    if found:
        p2 = found[0]
        var3 = s[p1:p2].bytes  # the slice up to (not including) the terminator, as bytes
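
Note that the two answers do not read the 24-bit field the same way: the arithmetic in answer 0 treats the first byte as the least significant one (little-endian), while bitstring's uint reads the bits most significant first (big-endian). The specification quoted in the question does not say which applies, so it is worth checking against a file whose values are known. A small stdlib illustration, using three made-up bytes:

```python
raw = bytes([0x01, 0x02, 0x03])  # example bytes, not taken from any real file

big = int.from_bytes(raw, "big")        # first byte is the most significant
little = int.from_bytes(raw, "little")  # first byte is the least significant

# The same value computed by hand with little-endian weights, as in answer 0
by_hand = raw[0] + raw[1] * 256 + raw[2] * 256 * 256

print(hex(big), hex(little), by_hand == little)  # 0x10203 0x30201 True
```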