In Python 3, how do I print the character '\u2212' (minus sign "−") without a UnicodeEncodeError?

Asked: 2015-07-29 20:55:51

Tags: python python-3.x python-3.4

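I need to process some Excel files that contain a lot of "−" ('\u2212') characters, along with other characters. After many attempts, I still cannot even print the character to the screen or save it to a file: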

a = '−'  # U+2212 MINUS SIGN
print(a.encode('utf-8'))  # prints b'\xe2\x88\x92'
print(a)  # raises UnicodeEncodeError: 'gbk' codec can't encode character '\u2212' in position 0: illegal multibyte sequence
with open('test.txt', 'w') as file:
    file.write(a)  # raises UnicodeEncodeError: 'gbk' codec can't encode character '\u2212' in position 0: illegal multibyte sequence
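
For reference, the file-writing failure above is consistent with `open()` falling back to the locale's default encoding (gbk here) when no `encoding` argument is given. A minimal sketch of writing the same character with an explicit UTF-8 encoding follows; the temporary path is a hypothetical choice made only for this example.

```python
import os
import tempfile

a = '\u2212'  # U+2212 MINUS SIGN

# Hypothetical output path, used only for this sketch.
path = os.path.join(tempfile.gettempdir(), 'test_minus.txt')

# Passing encoding explicitly avoids the locale default (gbk in the question).
with open(path, 'w', encoding='utf-8') as f:
    f.write(a)

# Read the file back with the same encoding to confirm the round trip.
with open(path, encoding='utf-8') as f:
    restored = f.read()

print(restored == a)
```

Any program that later reads the file has to open it as UTF-8 as well, otherwise the three bytes will be misinterpreted.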

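The example on this page, https://docs.python.org/3.4/howto/unicode.html, drops the unencodable characters or replaces them with other characters, but I have to print the character itself, or at least write it to a file correctly: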

>>> u = chr(40960) + 'abcd' + chr(1972)
>>> u.encode('utf-8')
b'\xea\x80\x80abcd\xde\xb4'
>>> u.encode('ascii')
Traceback (most recent call last):
    ...
UnicodeEncodeError: 'ascii' codec can't encode character '\ua000' in
  position 0: ordinal not in range(128)
>>> u.encode('ascii', 'ignore')
b'abcd'
>>> u.encode('ascii', 'replace')
b'?abcd?'
>>> u.encode('ascii', 'xmlcharrefreplace')
b'&#40960;abcd&#1972;'
>>> u.encode('ascii', 'backslashreplace')
b'\\ua000abcd\\u07b4'
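
The same error handlers can be tried on the character from the question with the gbk codec instead of ascii. This is only an illustrative sketch: it shows that each handler substitutes something for '\u2212' rather than preserving it, and that UTF-8 encodes it without loss.

```python
a = '\u2212'  # U+2212 MINUS SIGN

# The default 'strict' handler raises, as in the question's traceback.
try:
    a.encode('gbk')
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# Each handler below replaces the character instead of keeping it.
replaced = a.encode('gbk', 'replace')
escaped = a.encode('gbk', 'backslashreplace')
xml_ref = a.encode('gbk', 'xmlcharrefreplace')

# UTF-8 can represent every Unicode character, so nothing is lost.
utf8 = a.encode('utf-8')

print(strict_ok, replaced, escaped, xml_ref, utf8)
```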

What should I do?
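
One possible workaround for the console side, sketched under assumptions: the helper name `safe_print` is hypothetical, and the gbk console is simulated with an in-memory stream. It keeps `print` from raising by escaping whatever the stream's encoding cannot represent. It does not make the console display the real glyph; that would require the output stream to use an encoding that contains U+2212 (for example by setting the `PYTHONIOENCODING` environment variable to `utf-8`), and whether the terminal then renders it correctly depends on the terminal.

```python
import io
import sys


def safe_print(text, stream=None):
    """Hypothetical helper: print text, escaping characters the stream cannot encode."""
    stream = stream if stream is not None else sys.stdout
    encoding = getattr(stream, 'encoding', None) or 'utf-8'
    # Round-trip through the stream's encoding; unencodable characters
    # become \uXXXX escapes instead of raising UnicodeEncodeError.
    printable = text.encode(encoding, 'backslashreplace').decode(encoding)
    print(printable, file=stream)


# Simulate a gbk console with an in-memory stream (an assumption for this sketch).
fake_console = io.TextIOWrapper(io.BytesIO(), encoding='gbk', newline='\n')
safe_print('5 \u2212 3', stream=fake_console)
fake_console.flush()
output = fake_console.buffer.getvalue()
```

Calling `safe_print(a)` without a `stream` argument writes to `sys.stdout` using whatever encoding the real console reports.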

0 answers:

No answers yet