Question

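I have a binary file that is several hundred MB in size. It contains samples in big-endian float32 format (4 bytes per sample). I want to convert them to little-endian format. Some background: I want to write them to a .wav file later, which requires little-endian data.

The code below is what I currently use. It seems to work correctly, but it is slow (I assume because I only write 4 bytes at a time):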

import struct

infile = "infile_big_endian.raw"
outfile = "outfile_little_endian.raw"

with open(infile, "rb") as old, open(outfile, "wb") as new:
    for chunk in iter(lambda: old.read(4), b""):
        chunk = struct.pack("<f", struct.unpack(">f", chunk)[0])
        new.write(chunk)

Is there a faster way to do this in Python?

Answer 1

NumPy is probably faster:

import numpy

numpy.memmap(infile, dtype=numpy.int32).byteswap().tofile(outfile)

Or, to overwrite the input file:

numpy.memmap(infile, dtype=numpy.int32).byteswap(inplace=True).flush()

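We memory-map the file as an array and use byteswap to reverse the byte order at C speed. I used int32 instead of float32 in case NaN values might be a problem when the data is treated as float32.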
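
As a rough check of the idea above, the following sketch wraps the first one-liner in a function and compares its output with the struct-based conversion from the question on a tiny file. The function name swap_float32_file, the sample values, and the temporary file names are made up for illustration, and mode="r" is an added assumption so that the input is mapped read-only.

```python
import os
import struct
import tempfile

import numpy


def swap_float32_file(infile, outfile):
    """Write a copy of infile with the bytes of every 4-byte word reversed."""
    # int32 serves only as a 4-byte container; the values are never
    # interpreted as floats, so NaN bit patterns pass through untouched.
    words = numpy.memmap(infile, dtype=numpy.int32, mode="r")
    words.byteswap().tofile(outfile)


# Hypothetical sample values, stored as big-endian float32.
samples = [0.0, 1.5, -2.25, 1e-3]
tmpdir = tempfile.mkdtemp()
src = os.path.join(tmpdir, "big.raw")
dst = os.path.join(tmpdir, "little.raw")
with open(src, "wb") as f:
    f.write(struct.pack(">4f", *samples))

swap_float32_file(src, dst)

with open(dst, "rb") as f:
    swapped = f.read()

# The result should match packing the same values as little-endian float32.
print(swapped == struct.pack("<4f", *samples))
```

Note that byteswap() without inplace=True returns a swapped copy, so this variant holds the whole converted array in memory before writing it out. For a file of a few hundred MB that is usually acceptable; the in-place variant avoids the copy but modifies the input file.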

