Question

In Python 3, suppose I have

>>> thai_string = 'สีเ'

Using encode gives

>>> thai_string.encode('utf-8')
b'\xe0\xb8\xaa\xe0\xb8\xb5\xe0\xb9\x80'

My question: how can I get encode() to return a bytes sequence that uses \u escapes instead of \x? And how can I decode them back to a Python 3 str?

I tried the ascii built-in, which gives

>>> ascii(thai_string)
"'\\u0e2a\\u0e35\\u0e40'"

But this doesn't seem quite right, because I can't decode it to get thai_string back.

The Python documentation tells me:

\xhh: character with hex value hh

\uxxxx: character with 16-bit hex value xxxx

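The documentation says \u is only used in string literals, but I'm not sure what that means. Is this a hint that my question has a flawed premise?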

Answer 1

You can use unicode_escape:

>>> thai_string.encode('unicode_escape')
b'\\u0e2a\\u0e35\\u0e40'

Note that encode() always returns a byte string (bytes), and the unicode_escape encoding is intended to:

Produce a string that is suitable as a Unicode literal in Python source code
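
The question also asks how to get back to a str, which the answer does not show. A minimal sketch of the round trip, assuming the escaped bytes came from encode('unicode_escape') and are therefore pure ASCII; the variable names are only illustrative:

```python
# The literal below uses \u escapes, which the parser turns into the
# three Thai characters; it is the same str as the one in the question.
thai_string = '\u0e2a\u0e35\u0e40'

# str -> bytes: each non-ASCII character becomes a literal backslash-u escape
escaped = thai_string.encode('unicode_escape')
print(escaped)

# bytes -> str: decoding with the same codec recovers the original text
restored = escaped.decode('unicode_escape')
print(restored == thai_string)

# If a str holding the escapes is wanted instead of bytes,
# the escaped bytes are plain ASCII and can be decoded as such
escaped_text = escaped.decode('ascii')
print(escaped_text)
```

This also seems to be what the documentation means by "only in string literals": \u0e2a is interpreted by the parser when it appears in source code, whereas the output of encode() contains an actual backslash followed by u0e2a.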
