Question

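I have a binary file that starts with a header listing the column types, followed by the data. A typical record looks like this:

8 bytes for a double, 8 bytes for a double, 4 bytes for an int32.

This sequence repeats many times (10k to 20M times). It can be read easily with: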

numpy.fromfile(file_id, dtype=[('A', '<f8'), ('B', '<f8'), ('C', '<i4')], count=n_repetitions)
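
For illustration, here is a minimal self-contained sketch of that fixed-width read. The field names `a`, `b`, `c`, the helper name `read_fixed_records`, and the `offset` argument for skipping the header are assumptions made for the example; the real header size and names depend on the file.

```python
import numpy as np

# One packed, little-endian record: double, double, int32 (20 bytes, no padding).
# The field names are placeholders chosen for this example.
record_dtype = np.dtype([('a', '<f8'), ('b', '<f8'), ('c', '<i4')])


def read_fixed_records(path, n_repetitions, offset=0):
    """Read n_repetitions fixed-width records starting at byte `offset`.

    `offset` is assumed to be the size of the header in bytes.
    """
    return np.fromfile(path, dtype=record_dtype,
                       count=n_repetitions, offset=offset)
```

Each column is then available by name, for example `records['c']` for the int32 field.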

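But now I have a similar sequence:

8 bytes for a double, 4 bytes for an int32 that defines the length of the following char field, chars of the length just defined, 8 bytes for a double, 4 bytes for an int32.

Because of this I cannot use S#, since the field length differs for every element in the list. Is there a better way to read the file than iterating over it record by record? The solution does not have to be numpy, but it should be callable from Python.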

Answer 1

Take a look at BitString:

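# example taken from illuminate
from bitstring import BitStream  # the stream-reading class in current bitstring releases

# read the whole file into memory and wrap it in a bit stream
with open(bitstring_or_filename, 'rb') as f:
    bs = BitStream(bytes=f.read())
recordlen = bs.read('uintle:8')  # length of each record, in bytes

# e.g. 206 * 8 = 1648, the record length in bits
for i in range(bs.len // (recordlen * 8)):  # integer division: range() needs an int
    # ....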
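
If an extra dependency is not wanted, the variable-length layout from the question can also be walked with the standard-library `struct` module. This is only a sketch under assumptions: the file is little-endian and unpadded, the whole data section fits in memory, and the names `HEAD`, `TAIL`, and `parse_records` are made up for this example. Because each record's length is only known after reading its int32, some sequential scan is hard to avoid.

```python
import struct

# Fixed parts on either side of the variable-length char field.
HEAD = struct.Struct('<di')  # double, then int32 giving the char field length
TAIL = struct.Struct('<di')  # double, then int32


def parse_records(buf, offset=0):
    """Parse variable-length records from a bytes-like object.

    Returns a list of (x, text, y, k) tuples, where `text` is the raw
    bytes of the char field. `offset` is where the data section starts.
    """
    records = []
    end = len(buf)
    while offset < end:
        x, n = HEAD.unpack_from(buf, offset)   # n = length of the char field
        offset += HEAD.size
        text = bytes(buf[offset:offset + n])
        offset += n
        y, k = TAIL.unpack_from(buf, offset)
        offset += TAIL.size
        records.append((x, text, y, k))
    return records
```

A typical call would read the file once, for example `buf = open(path, 'rb').read()`, and pass the header size as `offset`. Using `unpack_from` on one buffer avoids a separate `read()` call per field, which tends to matter at millions of records.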
