How do I find a byte sequence in a file?

Time: 2018-03-21 10:21:26

Tags: python

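I have a binary file in which I need to change a certain bit.

The byte address of that bit is relative to a certain byte sequence (some ASCII string):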

import os
from array import array

content = array('B')
with open(filename, mode="r+b") as file:
    content.fromfile(file, os.fstat(file.fileno()).st_size)
    abc = [ord(letter) for letter in "ABC"]
    i = content.index(abc)  # ValueError: array.index(x): x not in list
    content[i + 0x16] |= 1
    content.tofile(file)

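However, I must admit to my shame that, after extensive googling, I could not find a way to get the index of the "ABC" string.

Of course I could write a function that does it with a loop, but I can't believe there is no one-liner (OK, even a two-liner) that does it.

How can this be done?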

1 answer:

Answer 0 (score: 0)

Not sure if this is the most Pythonic way, but this works. Given this file:

$ cat so.bin    
���ABC̻�X��w
$ hexdump so.bin
0000000 eeff 41dd 4342 bbcc 58aa 8899 0a77     
000000e
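
To reproduce the sample file, here is a minimal sketch. It assumes the hexdump above prints 16-bit little-endian words (the default on a little-endian machine), so "eeff" means the bytes ff ee on disk. The names SAMPLE and write_sample are made up for illustration.

```python
# Bytes read off the hexdump above; each 16-bit word is shown byte-swapped,
# so "eeff 41dd" corresponds to ff ee dd 41 on disk.
SAMPLE = bytes.fromhex("ff ee dd 41 42 43 cc bb aa 58 99 88 77 0a")


def write_sample(path="so.bin"):
    """Write the 14-byte sample file and return the number of bytes written."""
    with open(path, "wb") as f:
        return f.write(SAMPLE)
```

Calling write_sample() should give a file in which "ABC" starts at offset 3 and "X" (0x58) sits at offset 9.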

Edit: the new solution starts here.

import string

char_ints = [ord(c) for c in string.ascii_letters]

with open("so.out.bin", "wb") as fo:
    with open("so.bin", "rb") as fi:

        # Read bytes but only keep letters.
        chars = []
        for b in fi.read():
            if b in char_ints:
                chars.append(chr(b))
            else:
                chars.append(" ")

        # Search for 'ABC' in the read letters.
        pos = "".join(chars).index("ABC")

        # We now know the position of the interesting byte.
        pos_x = pos + len("ABC") + 3 # known offset

        # Now copy all bytes from the input to the output, ...
        fi.seek(0)
        i = 0
        for b in fi.read():
            # ... but replace the interesting byte.
            if i == pos_x:
                fo.write(b"Y")
            else:
                fo.write(bytes([b]))
            i = i + 1

Edit: the new solution ends here.
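
For comparison, the same replacement can probably be written more compactly with bytes.find, which searches for a whole subsequence rather than a single element. This is only a sketch: patch_after and patch_file are hypothetical helper names, and the offset of 3 and the replacement byte "Y" are taken from the code above.

```python
def patch_after(data, marker=b"ABC", offset=3, new_byte=b"Y"):
    """Return a copy of data with one byte past the marker replaced.

    The target is `offset` bytes after the end of the first `marker`.
    """
    pos = data.find(marker)  # -1 when the marker is absent
    if pos < 0:
        raise ValueError("marker not found")
    target = pos + len(marker) + offset
    if target >= len(data):
        raise ValueError("target byte lies past the end of the data")
    patched = bytearray(data)  # bytes is immutable, so copy first
    patched[target:target + 1] = new_byte
    return bytes(patched)


def patch_file(src="so.bin", dst="so.out.bin"):
    """Read src, patch it in memory, and write the result to dst."""
    with open(src, "rb") as fi:
        data = fi.read()
    with open(dst, "wb") as fo:
        fo.write(patch_after(data))
```

Reading the whole file into memory is fine for small files like the sample; for very large files a streaming or memory-mapped approach may be preferable.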

I want to get the X that sits four positions after ABC. A little state keeping locates ABC, skips the offset, and prints the interesting byte.

pattern = b"ABC"
matched = 0  # number of leading pattern bytes matched so far
found = False
offsetAfterC = 3
lengthAfterC = 1

with open("so.bin", "rb") as f:
    pos = 0
    for b in f.read():
        pos = pos + 1
        if b == pattern[matched]:
            matched = matched + 1
        elif b == pattern[0]:
            # Restart on a fresh 'A'; enough here because 'A' occurs
            # only at the start of the pattern.
            matched = 1
        else:
            matched = 0

        if matched == len(pattern):
            # pos now counts the bytes up to and including 'C'.
            found = True
            break

    f.seek(0)
    i = 0
    while i < pos + offsetAfterC:
        b = f.read(1)
        i = i + 1
    while i < pos + offsetAfterC + lengthAfterC:
        b = f.read(1)
        print(hex(int.from_bytes(b, byteorder="big")))
        i = i + 1

Output:

0x58
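
Coming back to the original question (setting a bit at a fixed distance from the marker, in place), one possible sketch uses the standard mmap module, whose find method also searches for a byte sequence. The function name set_bit_after_marker is hypothetical. The offset 0x16 is counted from the start of the marker, as in the question's `content[i + 0x16] |= 1`, and the sketch assumes the file is non-empty and writable.

```python
import mmap


def set_bit_after_marker(path, marker=b"ABC", offset=0x16, mask=0x01):
    """Set the bits of `mask` in the byte `offset` bytes past the marker start.

    Modifies the file in place and returns the index of the changed byte.
    """
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), 0) as mm:
        i = mm.find(marker)  # -1 when the marker is absent
        if i < 0:
            raise ValueError("marker not found")
        target = i + offset
        if target >= len(mm):
            raise ValueError("target byte lies past the end of the file")
        mm[target] |= mask  # indexing an mmap yields an int in Python 3
        return target
```

Because the mapping writes straight into the file, there is no need to rewrite the rest of the content afterwards.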