Converting binary to UTF-8 in Python

Time: 2013-10-08 18:48:34

Tags: python string utf-8 binary converter

I have a binary string like this: 1101100110000110110110011000001011011000101001111101100010101000

I want to convert it to UTF-8. How can I do this in Python?

4 answers:

Answer 0: (score: 10)

Cleaner version (Python 2):

>>> test_string = '1101100110000110110110011000001011011000101001111101100010101000'
>>> print ('%x' % int(test_string, 2)).decode('hex').decode('utf-8')
نقاب
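
In Python 3, strings no longer have a `.decode('hex')` method, so the one-liner above will not run there. A rough equivalent, as a sketch only: it assumes the bit string length is a multiple of 8, and the name `bits_to_text_via_int` is made up for illustration.

```python
def bits_to_text_via_int(bits, encoding='utf-8'):
    # Assumption: len(bits) is a multiple of 8 (whole bytes only).
    n_bytes = len(bits) // 8
    # Parse the whole string as one big base-2 integer, then serialize it
    # back to bytes. Passing the length explicitly keeps leading zero bytes.
    raw = int(bits, 2).to_bytes(n_bytes, byteorder='big')
    return raw.decode(encoding)


test_string = '1101100110000110110110011000001011011000101001111101100010101000'
print(bits_to_text_via_int(test_string))
```

Going through `int.to_bytes` avoids the intermediate hex string entirely.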

The reverse (from @Robᵩ's comment):

>>> '{:b}'.format(int(u'نقاب'.encode('utf-8').encode('hex'), 16))
'1101100110000110110110011000001011011000101001111101100010101000'
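
One caveat with the integer round-trip: an integer does not remember leading zero bits, so text that starts with an ASCII character (first byte below 0x80) comes back shorter than 8 bits per byte. A possible Python 3 sketch that pads the result back out; `text_to_bits_via_int` is a hypothetical name, not something from the answer.

```python
def text_to_bits_via_int(text, encoding='utf-8'):
    raw = text.encode(encoding)
    if not raw:
        return ''  # nothing to encode; avoids returning a stray '0'
    # bin() yields e.g. '0b1000001'; strip the prefix, then left-pad with
    # zeros so every byte contributes exactly 8 characters.
    return bin(int.from_bytes(raw, byteorder='big'))[2:].zfill(len(raw) * 8)


print(text_to_bits_via_int('A'))  # 01000001 (8 bits, not 7)
```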

Answer 1: (score: 4)

Well, my idea is:
1. Split the string into octets.
2. Convert each octet to a character, using int and then chr.
3. Join them and decode the UTF-8 string into Unicode.

This code works for me, but I'm not sure what it prints, because my console doesn't support UTF-8 (Windows :P).

s = '1101100110000110110110011000001011011000101001111101100010101000'
u = "".join([chr(int(x,2)) for x in [s[i:i+8] 
                           for i in range(0,len(s), 8)
                           ]
            ])
d = u.decode('utf-8')

Hope this helps!
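
The same three steps could be sketched in Python 3 roughly as follows. There `chr` returns a text character, not a byte, so the octets are packed with `bytes(...)` instead. The function name `bits_to_text` is an assumption for illustration.

```python
def bits_to_text(bits, encoding='utf-8'):
    # Step 1: split the bit string into 8-character octets.
    octets = [bits[i:i + 8] for i in range(0, len(bits), 8)]
    # Step 2: parse each octet as a base-2 integer and pack them into bytes.
    raw = bytes(int(octet, 2) for octet in octets)
    # Step 3: decode the UTF-8 byte sequence into a str.
    return raw.decode(encoding)


s = '1101100110000110110110011000001011011000101001111101100010101000'
d = bits_to_text(s)
print(d)
```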

Answer 2: (score: 3)

>>> import re
>>> s = '1101100110000110110110011000001011011000101001111101100010101000'
>>> print (''.join([chr(int(x,2)) for x in re.split('(........)', s) if x ])).decode('utf-8')
نقاب

Or the reverse:

>>> s = u'نقاب'
>>> ''.join(['{:08b}'.format(ord(x)) for x in s.encode('utf-8')])
'1101100110000110110110011000001011011000101001111101100010101000'
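
Fixed-width formatting matters here for any character below U+0080: its single UTF-8 byte has a leading zero bit, which an unpadded binary format would silently drop. A small Python 3 sketch of the per-byte approach (iterating over `bytes` already yields integers there, so `ord` is not needed); `to_bit_string` is a hypothetical name.

```python
def to_bit_string(text, encoding='utf-8'):
    # '08b' renders each byte as exactly 8 binary digits, zero-padded.
    return ''.join(format(byte, '08b') for byte in text.encode(encoding))


print(to_bit_string('test'))  # 01110100011001010111001101110100
```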

Answer 3: (score: 1)

Use:

def bin2text(s): return "".join([chr(int(s[i:i+8],2)) for i in xrange(0,len(s),8)])


>>> print bin2text("01110100011001010111001101110100")
test
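
Note that in Python 2 the function above returns a byte string, so non-ASCII input still needs a `.decode('utf-8')` afterwards. A slightly more defensive Python 3 variant might look like this; the whitespace stripping, the validation, and the `encoding`/`errors` parameters are additions for illustration, not part of the answer.

```python
def bin2text(bits, encoding='utf-8', errors='strict'):
    # Tolerate spaces or newlines between groups of bits.
    bits = ''.join(bits.split())
    if set(bits) - {'0', '1'}:
        raise ValueError('bit string may only contain 0 and 1')
    if len(bits) % 8:
        raise ValueError('bit string length must be a multiple of 8')
    # Convert each 8-bit slice to an integer and pack the result into bytes.
    raw = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return raw.decode(encoding, errors)


print(bin2text('01110100 01100101 01110011 01110100'))  # test
```

Passing `errors='replace'` makes it return a best-effort string when the bytes are not valid UTF-8; with the default it raises `UnicodeDecodeError`.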