Question

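I'm using python3 to open an image in binary mode and then split the data at a specific marker (\xff\xda).

Everything after that marker is stored in a variable, and I want to replace all of the a's in it with e's.

But I'm running into trouble converting the binary data to a string:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe6 in position 13: ordinal not in range(128)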

with open(filein, "rb") as rd:
  with open(fileout,'wb') as wr:
    img = rd.read()
    if img.find(b'\xff\xda'): ## ff da start of scan
        splitimg = img.split(b'\xff\xda', 1)
        wr.write(splitimg[0])
        scanimg = splitimg[1]

        scanglitch = ""
        scanimg = scanimg.encode()

        for letter in scanimg :
            if letter not in 'a': 
                scanglitch += letter
            else :
                scanglitch += 'e'

    print(scanimg)

    wr.write(b'\xff\xda')
    content = scanglitch.decode()
    wr.write(content)

Aren't encode() and decode() the right way to convert binary data to a string and back?

Answer 1

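When dealing with binary data, you'll want to stay in binary mode as much as possible, especially since there's no guarantee that your string encoding of choice can represent all of the values.

Remember that bytes objects are basically lists of 8-bit unsigned integers, even though they have the convenient string-like b'xyz' syntax.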
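
To make both points concrete, here is a minimal sketch. The sample bytes are made up for illustration and do not come from a real image. It shows that iterating over or indexing a bytes object yields integers, and that decoding arbitrary binary data as ASCII fails as soon as a byte is 0x80 or higher.

```python
data = b"a\xe6z"  # hypothetical sample bytes, not taken from a real image

# Iterating over a bytes object yields integers in the range 0..255
values = list(data)

# Indexing gives an int, while slicing gives another bytes object
first = data[0]
head = data[:1]

# Decoding arbitrary binary data as ASCII fails for any byte >= 0x80
try:
    data.decode("ascii")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

print(values, first, head, decode_failed)
```

This is why the code below compares each item against ord("a") rather than the string "a".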

filein = "download.jpeg"
fileout = "glitch.jpg"

with open(filein, "rb") as rd:
    img = rd.read()
    # We can happily crash here if there's no FFDA; 
    # that means we're not able to process the file anyway
    prelude, marker, scanimg = img.partition(b"\xff\xda")
    scanglitch = []

    for letter in scanimg:  # iterating over bytes yields integers, so we compare against `ord()`
        if letter != ord("a"):
            scanglitch.append(letter)
        else:
            scanglitch.append(ord("e"))

with open(fileout, "wb") as wr:
    wr.write(prelude)
    wr.write(marker)
    wr.write(bytes(scanglitch))

(I know the replacement logic could be written as a list comprehension, but I figured this way would be more approachable.)
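
For reference, the more compact variants might look like the sketch below. The helper name `glitch` and the sample input are assumptions for illustration only. `bytes.replace` is a standard method that should perform the same substitution without ever leaving binary mode.

```python
def glitch(scanimg: bytes) -> bytes:
    """Replace every byte 'a' (0x61) with 'e' (0x65). Hypothetical helper."""
    # List-comprehension form of the explicit loop shown in the answer
    return bytes([ord("e") if b == ord("a") else b for b in scanimg])


sample = b"banana\xe6\xff"  # made-up data standing in for the scan section

via_comprehension = glitch(sample)
via_replace = sample.replace(b"a", b"e")  # built-in bytes method, same effect

print(via_comprehension == via_replace)
```

Either form keeps the data as bytes from start to finish, so no encode() or decode() call is needed.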

python3: converting binary data to a string and back
