Question

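I need to read up to the point of a certain string in a binary file, and then operate on the bytes that follow it. The string is 'colr' (this is a JPEG 2000 file), and here is what I have so far: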

from collections import deque

f = open('my.jp2', 'rb')
bytes =  deque([], 4)
while ''.join(map(chr, bytes)) != 'colr':
    bytes.appendleft(ord(f.read(1)))

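Given that this works:

bytes =  deque([0x63, 0x6F, 0x6C, 0x72], 4)
print ''.join(map(chr, bytes))

(it prints 'colr'), I don't understand why the test in my loop never evaluates to True. I wind up spinning, just hanging; the loop does not even exit once I have read through the whole file.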

Answer 1

Change your bytes.appendleft() to bytes.append() and it will work; it did for me.
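
To see why the direction matters: appendleft() pushes each new byte onto the front, so the window holds the most recent four bytes newest-first and reads 'rloc' at the moment the file contains 'colr'. Below is a minimal sketch of the corrected loop. It is written for Python 3 (the question uses Python 2) and runs against an in-memory io.BytesIO stand-in rather than a real file; the helper name seek_past and the sample bytes are made up for illustration. It also checks for end of file, which the loop in the question does not.

```python
from collections import deque
import io


def seek_past(stream, marker=b"colr"):
    """Advance stream to just past the first marker; return True if found."""
    window = deque(maxlen=len(marker))  # keeps only the newest len(marker) bytes
    while bytes(window) != marker:
        byte = stream.read(1)
        if not byte:  # an empty read means end of file: the marker is absent
            return False
        window.append(byte[0])  # append, not appendleft, preserves file order
    return True


# appendleft stores the bytes newest-first, so the window reads backwards
reversed_window = deque(maxlen=4)
for b in b"colr":
    reversed_window.appendleft(b)
print(bytes(reversed_window))  # b'rloc'

sample = io.BytesIO(b"\x00\x00jp2hcolr\x01\x02\x03")  # made-up sample data
found = seek_past(sample)
rest = sample.read()  # the bytes that follow the marker
print(found, rest)  # True b'\x01\x02\x03'
```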

Answer 2

with open("my.jpg", "rb") as f:
    print f.read().split("colr", 1)
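
If only the bytes after the marker are needed, partition() returns them directly and also reports whether the marker was present at all. This is a small sketch in Python 3 syntax, assuming the whole file fits in memory; the function name bytes_after is hypothetical and the data is a made-up stand-in for the file contents.

```python
def bytes_after(data, marker=b"colr"):
    """Return the bytes following the first marker, or None if it is absent."""
    _before, found, after = data.partition(marker)
    return after if found else None


data = b"\x00\x00jp2hcolr\x01\x02\x03"  # stand-in for open(path, "rb").read()
print(bytes_after(data))    # b'\x01\x02\x03'
print(bytes_after(b"abc"))  # None
```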

If you don't want to read it all at once, then:

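def preprocess(line):
    print "Do Something with this line"

def postprocess(line):
    print "Do something else with this line"

# handler for whole lines; switches once the marker has been seen
currentproc = preprocess
with open("my.jpg", "rb") as f:
    for line in f:
        if currentproc is preprocess and "colr" in line:
            # split at the first marker only, so the unpacking cannot fail
            left, right = line.split("colr", 1)
            preprocess(left)
            postprocess(right)
            currentproc = postprocess
        else:
            currentproc(line)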

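This goes line by line rather than byte by byte, but... I have a hard time imagining that you don't have enough memory to hold an entire jpg in memory. Python isn't a great language for minimizing memory or time footprint, but it is awesome for functional requirements :)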
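
If memory really is a concern, reading fixed-size chunks may be a safer middle ground than iterating over lines, since a binary file can contain very few newline bytes and a single "line" can be arbitrarily long. The following is only a sketch, in Python 3, with a hypothetical function name find_marker_offset and an arbitrary default chunk size. It carries over the last len(marker) - 1 bytes of what has been read so that a marker straddling a chunk boundary is still found.

```python
import io


def find_marker_offset(stream, marker=b"colr", chunk_size=64 * 1024):
    """Return the offset just past the first marker in stream, or -1."""
    tail = b""    # last len(marker) - 1 bytes of the data seen so far
    consumed = 0  # bytes read from the stream before the current chunk
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return -1
        buf = tail + chunk
        pos = buf.find(marker)
        if pos != -1:
            # buf starts at absolute offset consumed - len(tail)
            return consumed - len(tail) + pos + len(marker)
        consumed += len(chunk)
        keep = len(marker) - 1
        tail = buf[-keep:] if keep else b""


# made-up sample data; tiny chunks force the marker across a chunk boundary
sample = io.BytesIO(b"\x00" * 10 + b"colr" + b"\x01\x02\x03")
offset = find_marker_offset(sample, chunk_size=4)
sample.seek(offset)  # the stream has read past the marker, so seek back
print(offset, sample.read())  # 14 b'\x01\x02\x03'
```

The function returns an offset rather than leaving the stream positioned, because the chunk that contains the marker usually extends beyond it.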

Python - Reading a string from a binary file
