How do I print non-ASCII characters in a file using python3?

Date: 2017-08-14 16:50:59

Tags: python-3.x unicode

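My code sample is below; as you will see, it is quite simple. When I use it to print a file from an Ubuntu terminal window, I get the following error message: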

Traceback (most recent call last):
  File "/ascii_cat", line 22, in <module>
    print_file_in_ascii(f)
  File "/ascii_cat", line 16, in print_file_in_ascii
    for line in f:
  File "/usr/lib/python3.4/codecs.py", line 319, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte

Code:

#!/usr/bin/python3

import sys

def contains_only_ascii(a_string):
    try:
        for a_char in a_string.strip():
            if ord(a_char) < 32 or ord(a_char) > 126:
                return False
    except:
        pass
    return True

def print_file_in_ascii(fname):
    with open(fname, "r") as f:
        for line in f:
            if contains_only_ascii(line) == True:
                print(line, end="")

# sys.argv may hold several files when a * is used in the filename (shell globbing)
for f in sys.argv[1:]:
    print_file_in_ascii(f)

1 answer:

Answer 0 (score: 2)

You opened the file with the default encoding, which is UTF-8 on your system. The file is not encoded in UTF-8, so reading it raises an exception.
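To make the failure concrete, here is a minimal sketch that reproduces the same error without any file. The two sample bytes and the helper name `try_decode` are assumptions chosen for illustration; they only mirror the traceback (byte 0x8b at position 1) and are not taken from the asker's file. Incidentally, 0x1f 0x8b is also the gzip magic number, so the file might be compressed, although the question does not say so.

```python
import locale

# open() in text mode falls back to this encoding when encoding= is omitted.
default_encoding = locale.getpreferredencoding(False)

# Hypothetical sample: two bytes chosen to mirror the traceback
# (0x8b at index 1). They are not read from the asker's file.
sample = b"\x1f\x8b"


def try_decode(data, encoding="utf-8"):
    """Return None if data decodes cleanly, else the UnicodeDecodeError."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError as exc:
        return exc
    return None


error = try_decode(sample)
print("default text encoding:", default_encoding)
print("decode error:", error)
```

0x8b has the bit pattern of a UTF-8 continuation byte, so it cannot begin a character, which is why the codec reports an invalid start byte.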

Open the file with the correct encoding by specifying the encoding= parameter explicitly:

with open(fname, encoding='whatever_the_encoding_really_is') as f:
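Applied to the script from the question, that advice might look like the sketch below. The generator name `ascii_lines` and the `errors="replace"` fallback are assumptions that go beyond the answer: with `errors="replace"`, any byte that cannot be decoded becomes U+FFFD, which lies outside the printable ASCII range, so the affected line is skipped instead of raising. If the real encoding is known, passing it as `encoding=` is still the better fix.

```python
def contains_only_ascii(text):
    """True if every character, ignoring surrounding whitespace, is printable ASCII."""
    return all(32 <= ord(ch) <= 126 for ch in text.strip())


def ascii_lines(fname, encoding="utf-8", errors="replace"):
    """Yield the lines of fname that contain only printable ASCII.

    Assumption: errors="replace" is an illustrative fallback. Undecodable
    bytes turn into U+FFFD, so those lines fail the ASCII check and are
    dropped rather than aborting the loop with UnicodeDecodeError.
    """
    with open(fname, "r", encoding=encoding, errors=errors) as f:
        for line in f:
            if contains_only_ascii(line):
                yield line


# Example use, mirroring the loop in the question:
#     import sys
#     for name in sys.argv[1:]:
#         for line in ascii_lines(name):
#             print(line, end="")
```

Keeping the filter in a generator separates reading from printing, so the same function can feed `print` or be collected into a list.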