Question

Is there a built-in way to "convert" a bytestring to a unicode string? I don't want to decode it; I want the string I see when I print it, without the "b".

e.g. Input:

b'\xb5\xb5\xb5\xb5\r\n1'

Output:

'\xb5\xb5\xb5\xb5\r\n1'

I have tried iterating over the byte string, but that gives me a list of integers:

my_bytestring = b'%PDF-1.4\n%\x93\x8c\x8b\x9e'

my_string = ""
my_list = []
for char in my_bytestring:
    my_list.append(char)
    my_string += str(char)
print(my_list)   # -> list of ints
print(my_string) # -> string of converted ints

I get:

[37, 80, 68, 70, 45, 49, 46, 52, 10, 37, 147, 140, 139, 158]

I want a list of the characters that appear when the bytestring is printed, not their integer values.

Answer 1

Use the built-in chr(i) function:

>>> b = b"\xb5\xb5\xb5\xb5\r\n1"
>>> s = "".join([chr(i) for i in b])
>>> s
'µµµµ\r\n1'
>>> len(b), len(s)
(7, 7)

As @hop mentioned, it is better to use this approach:

>>> s0 = b.decode(encoding="unicode_escape")
>>> s0
'µµµµ\r\n1'
>>> len(s0)
7
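For what it's worth, the chr() join above gives the same result as decoding with the latin-1 codec, which maps every byte value 0-255 to the code point with the same number. Strictly speaking that is still a decode, just one that cannot fail. A minimal sketch (the helper name bytes_to_str_1to1 is made up for illustration):

```python
def bytes_to_str_1to1(data: bytes) -> str:
    """Map each byte to the code point with the same number (0-255)."""
    return data.decode("latin-1")


b = b"\xb5\xb5\xb5\xb5\r\n1"
s = bytes_to_str_1to1(b)

# Same result as joining chr() over the individual byte values.
assert s == "".join(chr(i) for i in b)

# One character per byte, and the mapping is reversible.
assert len(s) == len(b)
assert s.encode("latin-1") == b

print(repr(s))
```

Unlike unicode_escape, latin-1 does not interpret backslash sequences inside the data, so the two only agree when the bytes contain no literal backslashes.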

However, looking at your 2nd example, it seems you need repr(object):

>>> my_bytestring = b'%PDF-1.4\n%\x93\x8c\x8b\x9e'
>>> l = [i for i in repr(my_bytestring)][2:-1]
>>> l
['%', 'P', 'D', 'F', '-', '1', '.', '4', '\\', 'n', '%', '\\', 'x', '9', '3', '\\', 'x', '8', 'c', '\\', 'x', '8', 'b', '\\', 'x', '9', 'e']
>>> len(my_bytestring), len(l)
(14, 27)
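If the printed form is wanted as a single string rather than a list of characters, the same slice can be applied to the repr() directly. A sketch, assuming the repr always starts with a one-character b prefix plus one quote character and ends with one quote character (the helper name printed_form is hypothetical):

```python
def printed_form(data: bytes) -> str:
    """Return repr(data) without the leading b and the surrounding quotes."""
    text = repr(data)   # looks like b'...' or b"..."
    return text[2:-1]   # drop the b prefix, the opening quote and the closing quote


my_bytestring = b'%PDF-1.4\n%\x93\x8c\x8b\x9e'
s = printed_form(my_bytestring)

print(s)                            # -> %PDF-1.4\n%\x93\x8c\x8b\x9e
print(len(my_bytestring), len(s))   # -> 14 27
```

Note that the result is the escaped text, so each non-printable byte takes up several characters, which is why the lengths differ.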

Answer 2

Technically, you cannot get from bytes to a string without decoding, but there is a codec that does what you want:

>>> b = b'\xb5\xb5\xb5\xb5\r\n1'
>>> s = b.decode('unicode_escape')
>>> s
'µµµµ\r\n1'
>>> print(s)
µµµµ
1

There is also raw_unicode_escape. You can read about the differences in the documentation.
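As a rough illustration of how the codecs differ, the sketch below decodes a byte string that contains a literal backslash. The sample value is made up for this example, and latin-1 is included only as a baseline that interprets nothing; treat it as an illustration and check the documentation for the exact rules:

```python
# Made-up sample: the four bytes "a", a literal backslash, "n", and 0xb5.
data = b"a\\n\xb5"

# unicode_escape interprets backslash sequences the way a str literal would,
# so the backslash and "n" collapse into one newline character.
as_escape = data.decode("unicode_escape")

# raw_unicode_escape only interprets \uXXXX and \UXXXXXXXX sequences;
# everything else, including this backslash, is mapped byte for byte.
as_raw = data.decode("raw_unicode_escape")

# latin-1 interprets nothing at all: every byte maps to one character.
as_latin1 = data.decode("latin-1")

print(len(data), len(as_escape), len(as_raw), len(as_latin1))  # -> 4 3 4 4
print(repr(as_escape), repr(as_raw))
```

So unicode_escape only gives one character per byte when the data contains no backslashes, which happens to be the case for the example in the question.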

I seriously doubt there is a use case for having binary data in a unicode string.

Python 3: represent a bytestring as a string (without decoding)