Question

I have a hexadecimal string:

Hex = 'E388854083969497A4A38599408881A2409985829696A38584408699969440814082A48783888583924B'

As a bytes object, it looks like this:

b'\xe3\x88\x85@\x83\x96\x94\x97\xa4\xa3\x85\x99@\x88\x81\xa2@\x99\x85\x82\x96\x96\xa3\x85\x84@\x86\x99\x96\x94@\x81@\x82\xa4\x87\x83\x88\x85\x83\x92K'

In EBCDIC, it reads:

The computer has rebooted from a bugcheck.

So I know that hex 40 (x40) is a 'space' in EBCDIC and an '@' in ASCII.

I can't figure out why Python prints '@' instead of '\x40' when it prints the bytes object.

My test code sample is:

import codecs
Hex = 'E388854083969497A4A38599408881A2409985829696A38584408699969440814082A48783888583924B'

output = []
DDF = [4,9,4,9,5,2,9]
distance = 0

# This breaks my hex string into chunks based off the list 'DDF'
for x in DDF:
    output.append(Hex[distance:x*2+distance])
    distance += x*2

#This prints out the list of hex strings
for x in output:
    print(x)

#This prints out the byte objects in the list
for x in output:
    x = codecs.decode(x, "hex")
    print(x)

#The next lines print the correct text
Hex = codecs.decode(Hex, "hex")
print(codecs.decode(Hex, 'cp1140'))

The output of the above is:

E3888540
83969497A4A3859940
8881A240
9985829696A3858440
8699969440
8140
82A48783888583924B
b'\xe3\x88\x85@'
b'\x83\x96\x94\x97\xa4\xa3\x85\x99@'
b'\x88\x81\xa2@'
b'\x99\x85\x82\x96\x96\xa3\x85\x84@'
b'\x86\x99\x96\x94@'
b'\x81@'
b'\x82\xa4\x87\x83\x88\x85\x83\x92K'
The computer has rebooted from a bugcheck.

So I guess my question is: how do I get Python to print the bytes object with 'x40' instead of '@'?

Thank you very much for your help :).

Answer 1

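When you print a bytes object with print(), Python shows its repr, which renders every byte in the printable ASCII range as the character itself and escapes only the remaining bytes as \xNN. If you need to print the full hex string, use binascii.hexlify():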

import binascii
import codecs

Hex = 'E388854083969497A4A38599408881A2409985829696A38584408699969440814082A48783888583924B'

>>> binascii.hexlify(codecs.decode(Hex, 'hex'))
b'e388854083969497a4a38599408881a2409985829696a38584408699969440814082a48783888583924b'
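If what you actually want is the \xNN form for every byte (including printable ones such as 0x40), one option is to build that string yourself. The following is only a minimal sketch; `escape_all` is a hypothetical helper name, not something from the standard library:

```python
import codecs


def escape_all(data: bytes) -> str:
    # Format every byte as a backslash-x escape, even printable ASCII ones.
    return "".join(f"\\x{b:02x}" for b in data)


# First chunk from the question ("The " in EBCDIC).
chunk = codecs.decode("E3888540", "hex")

print(chunk)              # default repr: b'\xe3\x88\x85@'
print(escape_all(chunk))  # \xe3\x88\x85\x40
print(chunk.hex())        # e3888540
```

`bytes.hex()` gives the same digits as `binascii.hexlify()`, but as a `str` rather than a bytes object.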

Answer 2

I think your byte array is slightly off.

According to this, you need to use 'cp500' for decoding, e.g.:

my_string_in_hex = 'E388854083969497A4A38599408881A2409985829696A38584408699969440814082A48783888583924B'
my_bytes = bytearray.fromhex(my_string_in_hex)
print(my_bytes)

my_string = my_bytes.decode('cp500')
print(my_string)

Output:

bytearray(b'\xe3\x88\x85@\x83\x96\x94\x97\xa4\xa3\x85\x99@\x88\x81\xa2@\x99\x85\x82\x96\x96\xa3\x85\x84@\x86\x99\x96\x94@\x81@\x82\xa4\x87\x83\x88\x85\x83\x92K')
The computer has rebooted from a bugcheck.

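When you print the bytearray, it still prints an '@', but it really is \x40 "under the covers". That is just the object's __repr__(). Since this method takes no "decoding" argument that would tell it how to decode the data properly, it simply creates a "readable" string for printing purposes.

__repr__() or repr() is "just that": it is only a "representation" of the object, not the actual object. It does not mean the byte actually is an '@'. It just uses that character when printing. It is still a bytearray, not a string.

When you decode, the data is decoded properly using the code page you chose.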
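To convince yourself that '@' in the repr and \x40 are the same byte, you can compare them directly. This is a small sketch using the "8140" chunk from the question; the variable names are just for illustration:

```python
raw = bytes.fromhex("8140")  # sixth chunk from the question

# '@' and '\x40' are two spellings of the same byte value.
print(b"@" == b"\x40")       # True
print(raw == b"\x81\x40")    # True
print(hex(raw[1]))           # 0x40

# Only decoding with an EBCDIC code page turns 0x40 into a space.
decoded = raw.decode("cp500")
print(repr(decoded))         # 'a '
```

The repr never changes the stored data; it only picks a shorter way to display bytes that happen to be printable in ASCII.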
