How do I decode binary code into text?

Time: 2015-01-14 11:51:22

Tags: python python-2.7 binary

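As a fun project, I wanted to build a simple binary encoder in Python. After that worked nicely, I went on to upgrade it into both an encoder and a decoder... and suddenly it doesn't seem to work (only the second option fails; the first option still works fine).

The error I get when I try to decode, for example, '0100 0001' (which stands for "A") is as follows: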

Your message to decode: 0100 0010
KeyError                                  Traceback (most recent call last)
C:\Users\marco\AppData\Local\Enthought\Canopy32\App\appdata\canopy-1.4.0.1938.win-x86\lib\site-packages\IPython\utils\py3compat.pyc in execfile(fname, glob, loc)
    195             else:
    196                 filename = fname
--> 197             exec compile(scripttext, filename, 'exec') in glob, loc
    198     else:
    199         def execfile(fname, *where):

C:\Users\marco\Dropbox\1_TUDelft\4Q\AE1205 Python\my own codes\binary encoder.py in <module>()
     41     messageDecode = raw_input("Your message to decode: ")
     42     for character in messageDecode:
---> 43         print inverseBINARY[character],

KeyError: '0' 

I suspect the problem is the last command, the print statement, but I don't know how to correct it... Any suggestions?

Here is the code:

BINARY = {"A":"0100 0001",
"B":"0100 0010",
"C":"0100 0011",
"D":"0100 0100",
"E":"0100 0101",
"F":"0100 0110",
"G":"0100 0111",
"H":"0100 1000",
"I":"0100 1001",
"J":"0100 1010",
"K":"0100 1011",
"L":"0100 1100",
"M":"0100 1101",
"N":"0100 1110",
"O":"0100 1111",
"P":"0101 0000",
"Q":"0101 0001",
"R":"0101 0010",
"S":"0101 0011",
"T":"0101 0100",
"U":"0101 0101",
"V":"0101 0110",
"W":"0101 0111",
"X":"0101 1000",
"Y":"0101 1001",
"Z":"0101 1010",
" ":"0100 0000",
".":"0010 1110",
",":"0010 1100",
"?":"0011 1111"}

inverseBINARY = {v:k for k,v in BINARY.items()}

question = input("Do you wish to encode(press 1) or decode(press 2) into/from binary?")

if question == 1:
    messageEncode = raw_input("Your message to encode: ")
    for character in messageEncode:
        print BINARY[character.upper()],

if question == 2:
    messageDecode = raw_input("Your message to decode: ")
    for character in messageDecode:
        print inverseBINARY[character],    

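3 Answers:

Answer 0 (score: 3)

You are looping over the individual characters of the input message, but you need to look for groups of 9 characters (two 4-digit binary groups plus the space between them). Your mapping contains keys such as '0100 1001', not '0', '1', and ' '.

The simplest approach (albeit a somewhat fragile one) is to loop over indices in steps of 10 characters (the space between encoded characters is the 1 extra) and then grab 9 characters: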
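To see the mismatch concretely, the sketch below compares what a character-by-character loop yields with what the mapping's keys look like. The two-entry `inverse_binary` table is a shortened stand-in for the question's `inverseBINARY`, used only for illustration.

```python
# Shortened stand-in for the question's inverseBINARY mapping (illustrative only).
inverse_binary = {"0100 0001": "A", "0100 0010": "B"}

message = "0100 0010"

# Iterating over a string yields one character at a time.
single_chars = list(message)

# None of those single characters is a key, hence the KeyError: '0'.
missing = [ch for ch in single_chars if ch not in inverse_binary]

# The whole 9-character group, however, is a key.
decoded = inverse_binary[message]

print(single_chars)
print(decoded)
```

The first lookup attempted by the original loop is `inverseBINARY['0']`, which is exactly the key named in the traceback.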

for i in xrange(0, len(messageDecode), 10):
    group = messageDecode[i:i + 9]
    print inverseBINARY[group],    

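xrange() here produces integers in steps of 10, so 0, 10, 20, and so on. The messageDecode string is then sliced to grab 9 characters starting at each of those indices, so messageDecode[0:9], messageDecode[10:19], messageDecode[20:29], and so forth.

A more robust approach is to remove all the spaces and grab chunks of 8 characters; this tolerates extra or missing whitespace, but you have to re-insert the space to match your keys:

messageDecode = messageDecode.replace(' ', '')
for i in xrange(0, len(messageDecode), 8):
    group = messageDecode[i:i + 4] + ' ' + messageDecode[i + 4:i + 8]
    print inverseBINARY[group],

Or you could leave the spaces out of the inverseBINARY mapping:

inverseBINARY = {v.replace(' ', ''): k for k, v in BINARY.items()}

and then simply slice the input every 8 characters and look each chunk up directly.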
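Putting the space-free mapping and the 8-character slicing together might look like the following sketch. The `decode` helper and the three-entry `BINARY` table are illustrative assumptions rather than code from the answer, and the snippet uses `range` and returns a string so that it runs on both Python 2.7 and Python 3.

```python
# Shortened copy of the question's table (the full one has 30 entries).
BINARY = {"A": "0100 0001", "B": "0100 0010", "C": "0100 0011"}

# Keys without spaces, so an 8-character chunk can be looked up directly.
inverse_binary = {v.replace(" ", ""): k for k, v in BINARY.items()}


def decode(message):
    """Hypothetical helper: decode a string of 0s and 1s, ignoring spaces."""
    bits = message.replace(" ", "")
    chunks = [bits[i:i + 8] for i in range(0, len(bits), 8)]
    return "".join(inverse_binary[chunk] for chunk in chunks)


print(decode("0100 0011 0100 0001 0100 0010"))  # CAB
```

An input whose digit count is not a multiple of 8, or that contains a chunk missing from the table, still raises a KeyError, so real input would need some validation on top of this.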

Answer 1 (score: -1)

If you want to decode binary, why not use Python's native binary literals together with chr?

>>> print chr(0b01000010)
B

Edit

OK, this is how I would solve it:

from string import letters, punctuation
encode_data = {letter:bin(ord(letter)) for letter in letters+punctuation+' '}
decode_data = {bin(ord(letter)):letter for letter in letters+punctuation+' '}

def encode(message):
    return [encode_data[letter] for letter in message]

def decode(table):
    return [decode_data[item] for item in table]

encoded = encode('hello there')
print decode(encoded) # ['h', 'e', 'l', 'l', 'o', ' ', 't', 'h', 'e', 'r', 'e']
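The same table-building idea can reproduce the question's '0100 0001' layout instead of the '0b1000001' strings that bin() returns. The sketch below is one possible way to do that; `to_nibbles` and `build_table` are made-up helper names, and the alphabet is assumed to be the question's uppercase letters plus four punctuation characters. Note that a table generated this way maps a space to '0010 0000' (ASCII 32), whereas the hand-written table in the question uses '0100 0000', which is the ASCII code of '@'.

```python
import string


def to_nibbles(char):
    """Hypothetical helper: format a character's code as 'xxxx xxxx'."""
    bits = format(ord(char), "08b")  # zero-padded to 8 binary digits
    return bits[:4] + " " + bits[4:]


def build_table(alphabet):
    """Hypothetical helper: build the encode and decode dictionaries."""
    encode_table = {char: to_nibbles(char) for char in alphabet}
    decode_table = {code: char for char, code in encode_table.items()}
    return encode_table, decode_table


encode_table, decode_table = build_table(string.ascii_uppercase + " .,?")

print(encode_table["A"])          # 0100 0001
print(decode_table["0100 0010"])  # B
print(encode_table[" "])          # 0010 0000
```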

Answer 2 (score: -1)

Converting ASCII to binary:

>>> format(ord('A'), 'b')
'1000001'

Converting binary to ASCII:

>>> chr(int('1000001',2))
'A'

Here is a more compact version of your code:

question = raw_input("Your message to encode/decode: ")

try:
    question = int(question, 2) # Checks if input is binary.
    print 'Decoding...'
    print chr(question)
except ValueError:
    print 'Encoding...'
    print "".join([format(ord(i), 'b') for i in question])

[Test]:

alvas@ubi:~$ python test.py 
Your message to encode/decode: 1000001
Decoding...
A
alvas@ubi:~$ python test.py 
Your message to encode/decode: A
Encoding...
1000001
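One caveat with format(ord(i), 'b') is that it does not pad: 'A' gives 7 digits while a space gives only 6, so a joined string of several characters cannot be split back reliably, and chr(int(..., 2)) handles only a single character. A possible variation, assuming every character has a code below 256 and using hypothetical `encode` and `decode` helpers, pads each character to 8 digits:

```python
def encode(text):
    """Hypothetical helper: each character becomes exactly 8 binary digits."""
    return " ".join(format(ord(char), "08b") for char in text)


def decode(bits):
    """Hypothetical helper: inverse of encode(); spaces are optional."""
    bits = bits.replace(" ", "")
    chunks = [bits[i:i + 8] for i in range(0, len(bits), 8)]
    return "".join(chr(int(chunk, 2)) for chunk in chunks)


encoded = encode("Hi A")
print(encoded)          # 01001000 01101001 00100000 01000001
print(decode(encoded))  # Hi A
```

Because every chunk has the same width, the decoder can split the digits by position alone and does not depend on where the spaces fall.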