Question

Here is my question, reworded.

Read the first 10 bytes of a binary file (to be manipulated later) -

infile = open('infile.jpg', 'rb')
outfile = open('outfile.jpg', 'wb')
x = infile.read(10)
for i in x:
    print(i, end=', ')
print(x)
outfile.write(bytes(x, "UTF-8"))

The first print statement gives -

255, 216, 255, 224, 0, 16, 74, 70, 73, 70,

The second print statement gives -

b'\xff\xd8\xff\xe0\x00\x10JFIF'

which is a hexadecimal interpretation of the values in x.

outfile.write(bytes(x, "UTF-8"))

returns -

TypeError: encoding or errors without a string argument

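So x must not be an ordinary string but rather a byte string, which is still iterable?

If I want to write the contents of x to outfile.jpg unaltered, then I go -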

outfile.write(x)

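Now I try to take each x[i] and perform some operation on it (shown below as the bone-simple multiplication by 1), assign the values to y, and write y to outfile.jpg so that it is identical to infile.jpg. So I try -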

infile = open('infile.jpg', 'rb')
outfile = open('outfile.jpg', 'wb')
x = infile.read(10)

yi = len(x)
y = [0 for i in range(yi)]

j = 0
for i in x:
    y [j] = i*1
    j += 1

for i in x:
    print(i, end=', ')

print(x)

for i in y:
    print(i, end=', ')

print(y)

print(repr(x))
print(repr(y))

outfile.write(y)

The first print statement (iterating through x) gives -

255, 216, 255, 224, 0, 16, 74, 70, 73, 70,

The second print statement gives -

b'\xff\xd8\xff\xe0\x00\x10JFIF'

The third print statement (iterating through y) gives -

255, 216, 255, 224, 0, 16, 74, 70, 73, 70,

The print(y) statement gives -

[255, 216, 255, 224, 0, 16, 74, 70, 73, 70]

Finally, printing repr(x) and repr(y), as suggested by Tim, gives, respectively -

b'\xff\xd8\xff\xe0\x00\x10JFIF'
[255, 216, 255, 224, 0, 16, 74, 70, 73, 70]

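And the file write statement gives the error -

TypeError: 'list' does not support the buffer interface

What I need is for y to be the same type as x, so that outfile.write(x) = outfile.write(y)

I stare into the eyes of the Python, but still I do not see its soul.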

Answer 1

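They are not at all identical - they only display identically after str() is applied to them (which print() does implicitly). Print the repr() of each and you will see the difference. For example:

>>> x = b'ab'
>>> y = "b'ab'"
>>> print(x)
b'ab'
>>> print(y)  # displays identically
b'ab'
>>> print(repr(x))  # but x is really a 2-byte bytes object
b'ab'
>>> print(repr(y))  # and y is really a 5-character string
"b'ab'"

Mixing up strings and bytes objects makes no sense (well, not in the absence of an explicit encoding - but you're not trying to encode or decode anything here, right?). If you are working with binary files, you should not be using strings at all - you should be using bytes or bytearray objects.

So the problem isn't really in how you are writing: the logic is fundamentally confused before that point.

I can't guess what you want. Please edit the question to show a complete, executable example of what you are trying to accomplish. We don't need JPG files for that - make up some short, arbitrary binary data. Like:

dummy_jpg = b'\x01\x02\xff'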
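As one possible way to follow that suggestion, the sketch below replaces the JPG with a few made-up bytes and uses in-memory io.BytesIO buffers in place of real files. The names source and sink are illustrative only and do not come from the question.

```python
import io

# Made-up binary data standing in for the first bytes of a real file.
dummy_jpg = b'\x01\x02\xff'

# io.BytesIO behaves like a file opened in binary mode.
source = io.BytesIO(dummy_jpg)
sink = io.BytesIO()

x = source.read(3)        # a bytes object
y = [i * 1 for i in x]    # a list of ints, as in the question
sink.write(bytes(y))      # convert the list back to bytes before writing

print(type(x), type(y))
print(sink.getvalue() == dummy_jpg)
```

Anyone can paste and run an example of this shape without needing the original image.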

Answer 2

...and this is how you read and write files in binary mode in Python.

#open binary files infile and outfile
infile = open('infile.jpg', 'rb')
outfile = open('outfile.jpg', 'wb')

#n = bytes to read
n=5

#read bytes of infile to x
x = infile.read(n)

#print x type, x
print()
print('x = ', repr(x), type(x))
print()

x =  b'\xff\xd8\xff\xe0\x00' <class 'bytes'>

#define y of type list, length xi
xi = len(x)
y = [0 for i in range(xi)]

#print y type, y
print('y =', repr(y), type(y))
print()

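y = [0, 0, 0, 0, 0] <class 'list'>

#convert each byte of x to an 8-bit binary string and place in y, type list
#(iterating over bytes already yields ints in Python 3, so ord() is not needed)
j=0
for i in x:
    y[j] = '{:08b}'.format(i)
    j += 1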

#print y type, and y
print('y =', repr(y), type(y))
print()

y = ['11111111', '11011000', '11111111', '11100000', '00000000'] <class 'list'>

#perform bit level operations on y [i], not done in this example.

#convert y [i] back to integer
j=0
for i in y:
    y [j] = int(i, 2)
    j += 1

#print y type, and y
print('y =', repr(y), type(y))
print()

y = [255, 216, 255, 224, 0] <class 'list'>

#convert y to type byte and place in z
z = bytearray(y)

#print z type, and z
print('z =', repr(z), type(z))
print()

z = bytearray(b'\xff\xd8\xff\xe0\x00') <class 'bytearray'>

#output z to outfile
outfile.write(z)

infile.close()
outfile.close()
outfile = open('outfile.jpg', 'rb')

#read bytes of outfile to x
x = outfile.read(n)

#print x type, and x
print('x =', repr(x), type(x))
print()

x = b'\xff\xd8\xff\xe0\x00' <class 'bytes'>

#conclusion:  first n bytes of infile = n bytes of outfile (without bit level operations)

outfile.close()
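The same round trip can be written more compactly. What follows is only a sketch: the helper name copy_prefix, its transform parameter, and the temporary file paths are assumptions made for illustration. The with blocks are there so the files get closed even if an error occurs.

```python
import os
import tempfile


def copy_prefix(in_path, out_path, n, transform=lambda b: b):
    """Read n bytes, apply transform to each byte value, write the result."""
    with open(in_path, 'rb') as infile:
        x = infile.read(n)                      # bytes
    # Mask with 0xFF so every value stays inside range(256).
    z = bytes(transform(b) & 0xFF for b in x)
    with open(out_path, 'wb') as outfile:
        outfile.write(z)
    return z


# Demonstration with made-up data instead of a real JPG.
tmp_dir = tempfile.mkdtemp()
in_path = os.path.join(tmp_dir, 'infile.bin')
out_path = os.path.join(tmp_dir, 'outfile.bin')
with open(in_path, 'wb') as f:
    f.write(b'\xff\xd8\xff\xe0\x00\x10JFIF')

unchanged = copy_prefix(in_path, out_path, 5)
inverted = copy_prefix(in_path, out_path, 5, transform=lambda b: ~b)
print(unchanged)
print(inverted)
```

Working on the integer values directly avoids the detour through binary strings; the bit-level operation here (inverting every bit) is just an example.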

Answer 3

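Thanks for the clarification! What you want is very simple, but you really do need to read the docs for the bytes and bytearray types. What you do not want is anything to do with:

unicode
strings
encodings
decodings

Those are all entirely beside the point. You have binary data from start to finish, and need to stick with bytes and/or bytearray objects. Both are sequences of bytes ("small integers" in range(256)); bytes is an immutable sequence, and bytearray is a mutable sequence.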
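The difference between the two types can be seen in a short sketch (the variable names b and ba are illustrative):

```python
b = bytes([255, 216, 255])
ba = bytearray(b)

ba[0] = 0            # fine: bytearray is mutable

try:
    b[0] = 0         # bytes is immutable, so item assignment fails
except TypeError as exc:
    print('bytes:', exc)

print(b, ba)
```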

So x must not be an ordinary string but rather a byte string, which is still iterable?

Read the docs ;-) x is not a "string"; do this to see its type:

print(type(x))

That will display:

<class 'bytes'>

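It's a bytes object, as was already briefly explained. It's a sequence, so yes, it's iterable, like all sequences. You can also index it, slice it, and so on.

Your y is a list. Alas, I can't figure out what you're trying to accomplish with it.

What I need is for y to be the same type as x, so that outfile.write(x) = outfile.write(y)

No, you don't need x and y to be the same type. You do want to be able to write y as binary data. For that, you need to create a bytes or bytearray object. That's easy; just do one of: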
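A short sketch of those sequence operations, using the 10 bytes from the question:

```python
x = b'\xff\xd8\xff\xe0\x00\x10JFIF'

print(type(x))        # <class 'bytes'>
print(x[0])           # indexing yields an int: 255
print(x[6:])          # slicing yields another bytes object: b'JFIF'
print(list(x[:4]))    # iterating yields ints: [255, 216, 255, 224]
```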

 y = bytes(y)

or

 y = bytearray(y)

and then

outfile.write(y)

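will do what you want.

Although, as noted above, I have no idea why you're creating a list here to begin with. A simpler way to create the same list would be to skip all the loops and write: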
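As a rough illustration of that conversion and its two most likely failure modes (the names as_bytes and as_bytearray are made up for this sketch):

```python
y = [255, 216, 255, 224, 0]

as_bytes = bytes(y)
as_bytearray = bytearray(y)
print(as_bytes, as_bytearray)

# Values outside range(256) are rejected.
try:
    bytes([256])
except ValueError as exc:
    print('ValueError:', exc)

# An encoding argument is only accepted together with a str,
# which is why bytes(x, "UTF-8") failed in the question.
try:
    bytes(as_bytes, 'UTF-8')
except TypeError as exc:
    print('TypeError:', exc)
```

If a per-byte operation can push a value outside range(256), it has to be brought back into range (for example by masking with 0xFF) before the conversion.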

 y = list(x)

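If I'm able to get through, you should begin to suspect that your mental model of what's going on here is too complicated, not too simple. You're imagining difficulties that don't exist :-) Reading from a binary file gives you a bytes object (or see the file .readinto() method if you want reading a binary file to fill in a bytearray object), while writing to a binary file requires giving it a bytes or bytearray object to write. That's all there is to it.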
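A minimal sketch of the .readinto() route, with io.BytesIO buffers standing in for files opened in 'rb' and 'wb' mode (the names source, buf, and sink are assumptions for illustration):

```python
import io

source = io.BytesIO(b'\xff\xd8\xff\xe0\x00\x10JFIF')  # stands in for open(..., 'rb')

buf = bytearray(4)              # preallocated, mutable buffer
count = source.readinto(buf)    # fills buf in place, returns the number of bytes read
buf[0] ^= 0xFF                  # modify a byte in place

sink = io.BytesIO()             # stands in for open(..., 'wb')
sink.write(buf[:count])         # a bytearray can be written directly
print(count, sink.getvalue())
```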

Reading and writing binary files in Python
