Question

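I'm trying to read and parse a binary file with Python.

The problem is that the data in the file can be in either little-endian or big-endian format, and the values can be either 32-bit or 64-bit. A few bytes in the file header specify the data format and size. Assume I have already read those and know the format and size, and I'm trying to construct a format string as follows: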

    if (bitOrder == 1):      # little-endian format
        strData = '<'
    elif (bitOrder == 2):    # big-endian format
        strData = '>'

    if (dataSize == 1):      # 32-bit data
        strLen = 'L'
    elif (dataSize == 2):    # 64-bit data
        strLen = 'q'

    strFormat = strData + strLen
    struct.unpack(strFormat, buf)

When I do this, I get the error "struct.error: unpack requires a string argument of length 2", but if I write struct.unpack('<L', buf) I get the expected result.

In the interactive shell, type(strFormat) gives <type 'str'> and len(strFormat) gives 2.

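So, being relatively new to Python, I have the following questions:

Is str not the same as a string? If not, how do I convert between the two?
How do I correctly build a format string for use with the unpack function?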

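------EDIT------ In response to the comments:

Due to constraints from other projects, I'm using python-2.7 for now.

I'm trying to avoid posting my code (it runs to a few hundred lines), but here is an interactive Python session (run from inside emacs, if that matters) that shows the behavior I'm seeing: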

Python 2.7.5 (default, Jun 17 2014, 18:11:42) 
[GCC 4.8.2 20140120 (Red Hat 4.8.2-16)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> >>> >>> >>> 
>>> import array
>>> import struct
>>> header = array.array('B',[0x7f, 0x45, 0x4c, 0x46, 0x02, 0x01, 0x01, 0x00, 0x00, 0x00, 0x00,0x00, 0x00, 0x00, 0x00, 0x00, 0x02, 0x00,0x3e, 0x00, 0x01, 0x00, 0x00, 0x00, 0x40, 0x04, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x70, 0x11, 0x00, 0x00, 0x00,0x00, 0x00, 0x00, 0x00,0x00, 0x00, 0x00, 0x40, 0x00, 0x38, 0x00, 0x09, 0x00, 0x40, 0x00, 0x1e, 0x00, 0x1b, 0x00])
>>> entry = header[24:32]
>>> phoff = header[32:40]
>>> shoff = header[40:48]
>>> strData = '<'
>>> strLen = 'H'
>>> strFormat = strData + strLen
>>> print strFormat
<H
>>> type(strFormat)
<type 'str'>
>>> len(strFormat)
2
>>> temp = struct.unpack(strFormat, entry)
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
struct.error: unpack requires a string argument of length 2
>>>

Fixed the typos in the original code.

Answer 1

Judging from the interactive session, your problem appears to be this line:

temp = struct.unpack(strFormat, entry)

Earlier, you wrote:

entry = header[24:32]

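entry is 8 bytes long, but strFormat ('<H') describes only 2 bytes. struct.unpack requires the buffer length to match the size of the format exactly, and that mismatch is what struct is complaining about.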
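
The size mismatch can be checked directly with struct.calcsize, which reports how many bytes a format string consumes. This is only a minimal sketch: entry is rebuilt here as a plain bytes value holding the same eight bytes as header[24:32] in the session, and the names expected, actual and sizes_match are made up for illustration.

```python
import struct

# Same eight bytes as header[24:32] in the question's session,
# rebuilt as a bytes value (str on Python 2) for this sketch.
entry = bytes(bytearray([0x40, 0x04, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00]))

fmt = '<H'
expected = struct.calcsize(fmt)   # bytes that '<H' consumes
actual = len(entry)               # bytes actually supplied

# struct.unpack() requires these two sizes to be equal.
sizes_match = (expected == actual)

# A format that covers all eight bytes unpacks without error.
(value,) = struct.unpack('<Q', entry)
```

Comparing calcsize(fmt) with len(buf) before calling unpack is a cheap way to catch this kind of error early.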

The buffer should also be a bytes object (str under 2.x), not an array.array.
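
Putting both points together, one possible way to build the format string from the header bytes and read the fields is sketched below. The helper names build_format and read_field are hypothetical. The sketch assumes the code mapping from the question (1 = little-endian, 2 = big-endian; 1 = 32-bit, 2 = 64-bit) and assumes the fields are unsigned, so it uses 'Q' rather than the question's 'q'. It is written to run on both Python 2.7 and Python 3.

```python
import struct

# The 64 header bytes from the question's session. A bytearray is used so
# that indexing yields integers on both Python 2 and Python 3.
raw = bytearray([
    0x7f, 0x45, 0x4c, 0x46, 0x02, 0x01, 0x01, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x02, 0x00, 0x3e, 0x00, 0x01, 0x00, 0x00, 0x00,
    0x40, 0x04, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x40, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x70, 0x11, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x40, 0x00, 0x38, 0x00,
    0x09, 0x00, 0x40, 0x00, 0x1e, 0x00, 0x1b, 0x00,
])
data = bytes(raw)  # str on Python 2, bytes on Python 3


def build_format(byte_order, data_size):
    """Hypothetical helper: map the header codes to a struct format."""
    order_codes = {1: '<', 2: '>'}   # assumed: 1 = little, 2 = big
    size_codes = {1: 'L', 2: 'Q'}    # assumed: 1 = 32-bit, 2 = 64-bit
    return order_codes[byte_order] + size_codes[data_size]


def read_field(buf, offset, fmt):
    """Hypothetical helper: unpack one value starting at offset."""
    # unpack_from reads exactly calcsize(fmt) bytes from the offset,
    # so no manual slicing to the right length is needed.
    return struct.unpack_from(fmt, buf, offset)[0]


# Byte 5 holds the byte-order code and byte 4 the size code in this header.
fmt = build_format(byte_order=raw[5], data_size=raw[4])

entry = read_field(data, 24, fmt)
phoff = read_field(data, 32, fmt)
shoff = read_field(data, 40, fmt)
```

For this header the format comes out as '<Q'. Using struct.unpack_from with an offset avoids the length mismatch, because the slice size is derived from the format instead of being hard-coded.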
