When I run this Python 2.7 code (edit: updated the code):
import io
x = io.StringIO(u'\ud801')
CPython runs it without complaint, but IronPython throws the following error:
UnicodeEncodeError:
Unable to translate Unicode character \uD801 at index 0 to specified code page.
I think this is because U+D801 is an unpaired surrogate and thus an invalid character, but which implementation shows the correct behavior here? Should this code throw or not?
Answer 0 (score: 0)
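Both are correct, but they are not doing the same thing. IronPython appears to be trying to print the Unicode character and failing to convert it to the current code page. You get the same behavior in Python 2.7 if you print the character: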
>>> import io
>>> io.StringIO(u'\ud801').getvalue()
u'\ud801'
>>> print(io.StringIO(u'\ud801').getvalue())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\encodings\cp437.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode character u'\ud801' in position 0: character maps to <undefined>
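
To separate the two steps without involving the console, here is a minimal sketch that stores the lone surrogate and then encodes it explicitly. It assumes the console code page is cp437, as in the traceback above; `encode_for_console` is a hypothetical helper made up for this illustration, not part of `io` or of either interpreter.

```python
import io

# A lone high surrogate, as in the question (no low surrogate follows it).
lone = u'\ud801'

# Storing the text in a StringIO involves no encoding step, so this succeeds.
stored = io.StringIO(lone).getvalue()


def encode_for_console(text, encoding='cp437', errors='strict'):
    """Hypothetical helper: encode text as a print to the console would need to.

    Returns None instead of raising when the text cannot be encoded.
    """
    try:
        return text.encode(encoding, errors)
    except UnicodeEncodeError:
        return None


# Strict encoding fails: cp437 has no mapping for U+D801.
strict_result = encode_for_console(lone)

# Lossy fallbacks that avoid the exception.
replaced = encode_for_console(lone, errors='replace')
escaped = encode_for_console(lone, errors='backslashreplace')
```

Here `strict_result` is `None`, while the two fallback calls return byte strings, so the failure comes from the encoding step and not from `io.StringIO` itself. If you only need to display the value, printing `repr(value)` is another way to avoid the encode.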