Question

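I am having trouble loading images from a file as a string. Many of the functions I need in my program rely on the data they read being ASCII-encoded, and they simply cannot handle the data I give them and fail with an error. This is the code involved: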

import binascii

def text_to_bits(text, encoding='utf-8', errors='surrogatepass'):
    bits = bin(int(binascii.hexlify(text.encode(encoding, errors)), 16))[2:]
    return bits.zfill(8 * ((len(bits) + 7) // 8))

def str2int(string):
    binary = text_to_bits(string)
    number = int(binary, 2)
    return number

def go():
    #filen is the name of the file
    global filen
    #Reading the file
    content = str(open(filen, "r").read())
    #Using A function from above
    integer = str2int(content)
    #Write back to the file
    w = open(filen, "w").write(str(integer))

So how would I go about converting this data to ASCII?

Edit:

The code above is the admittedly messy code I am using. Please don't comment on how messy it is; this is a rough draft.

Answer 1

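Image data is not ASCII. Image data is binary, and so it uses bytes that the ASCII standard does not cover. Do not try to decode the data as ASCII. You also need to make sure you open the file in binary mode, to avoid platform-specific line separator translation, which would corrupt your image data.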

Any method that expects to handle image data will work with binary data, and in Python 2 that means you will be handling it as the str type.
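A minimal sketch of reading a file in binary mode. It assumes Python 3, where binary data is the bytes type rather than str. The helper name `read_binary` and the sample bytes are made up for illustration and are not from the question:

```python
import os
import tempfile

def read_binary(path):
    """Return the raw bytes of the file at `path`, with no decoding
    and no newline translation."""
    with open(path, 'rb') as fileobj:
        return fileobj.read()

# Hypothetical sample: contains a CRLF pair and bytes outside the ASCII range
sample = b'\x89PNG\r\n\x1a\n\xa8\xff'

fd, tmp_path = tempfile.mkstemp()
try:
    # Write the sample in binary mode so nothing is translated on the way in
    with os.fdopen(fd, 'wb') as fileobj:
        fileobj.write(sample)
    data = read_binary(tmp_path)
finally:
    os.remove(tmp_path)

# The bytes come back unchanged, including the \r\n pair
print(type(data).__name__, data == sample)
```

Opening the same file in text mode would either fail to decode the non-ASCII bytes or, on some platforms, rewrite the line separators.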

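In your specific case, you are using a function that expects to work on Unicode text, not binary image data, and it tries to encode that text to bytes. In other words, because you are handing it data that is already binary (encoded), the function applies a Unicode-to-bytes conversion to data that is already bytes. Python 2 then first tries to decode the data, to obtain the Unicode value it needs to encode. It is this implicit decode that fails here: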

>>> '\xa8'.encode('utf8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa8 in position 0: ordinal not in range(128)

Note that I encoded, but got a decoding exception.
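To make the hidden step visible, here is a small sketch that performs the ASCII decode explicitly. It assumes Python 3, where bytes objects have no encode method, so the decode Python 2 did silently has to be written out. `explicit_ascii_decode` is a hypothetical helper name:

```python
def explicit_ascii_decode(raw):
    """Decode `raw` bytes as ASCII.

    Returns the decoded text, or a string describing the failure
    instead of raising, so both cases can be printed side by side.
    """
    try:
        return raw.decode('ascii')
    except UnicodeDecodeError as exc:
        return 'failed: {}'.format(exc)

# Pure ASCII bytes decode without trouble
print(explicit_ascii_decode(b'abc'))

# A byte above 0x7f cannot be decoded as ASCII; this is the same
# failure that Python 2 hit during its implicit decode
print(explicit_ascii_decode(b'\xa8'))
```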

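The code you are using is extremely convoluted. If you want to interpret the whole binary contents of the file as one large integer, you can do so by converting to a hexadecimal representation, but you would not convert to a binary string and then back to an integer again. The following would suffice: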

import binascii

# 'rb' reads the raw bytes without decoding or newline translation
with open(filename, 'rb') as fileobj:
    binary_contents = fileobj.read()
    integer_value = int(binascii.hexlify(binary_contents), 16)
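As a hedged aside, on Python 3 the same integer can be obtained without the hex detour via `int.from_bytes`. The sketch below assumes Python 3; `bytes_to_int` is a hypothetical helper name and `sample` is a made-up stand-in for the file contents:

```python
import binascii

def bytes_to_int(data):
    """Interpret `data` as one big-endian unsigned integer."""
    return int.from_bytes(data, 'big')

# Hypothetical stand-in for the bytes read from a file
sample = b'\x00\xa8\xff'

# The hexlify route from the answer and the direct route agree
via_hex = int(binascii.hexlify(sample), 16)
direct = bytes_to_int(sample)
print(via_hex == direct)

# Leading zero bytes are not recorded in the integer, so the original
# length is needed to turn the number back into the same bytes
restored = direct.to_bytes(len(sample), 'big')
print(restored == sample)
```

Note that the hexlify route raises ValueError on an empty file, whereas `int.from_bytes` returns 0 for empty input.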

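However, image data is not usually interpreted as one long number. Binary data can encode integers, but when processing images you would normally use the struct module to decode specific integer values from specific bytes.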
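As an illustration of that approach, the sketch below uses struct to pull the width and height out of a PNG header. PNG is chosen here only as an example; the question does not say which image format is involved. The code assumes Python 3, `png_dimensions` is a hypothetical helper, and the header is synthetic (no checksum), built just for this demonstration:

```python
import struct

PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'

def png_dimensions(data):
    """Return (width, height) read from the IHDR chunk at the start of PNG data."""
    # Bytes 0..7 are the signature, 8..11 the chunk length, 12..15 the chunk type
    if data[:8] != PNG_SIGNATURE or data[12:16] != b'IHDR':
        raise ValueError('data does not start with a PNG IHDR chunk')
    # Width and height are two big-endian unsigned 32-bit integers at bytes 16..23
    return struct.unpack('>II', data[16:24])

# Synthetic header for illustration: chunk length 13, type IHDR, 640x480,
# bit depth 8, colour type 2, then compression, filter and interlace set to 0
header = PNG_SIGNATURE + struct.pack(
    '>I4sIIBBBBB', 13, b'IHDR', 640, 480, 8, 2, 0, 0, 0)

width, height = png_dimensions(header)
print(width, height)
```

Each value is decoded from a fixed position with a known size and byte order, rather than treating the whole file as a single number.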

Reading an image from a file using ascii encoding
