Question

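I wrote a short script that reads from a file containing information about blog articles. Each line of the file corresponds to one article, with tab-separated columns holding the article's id, title and paragraph.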

id  title   paragraph
1   Motorola prototypes from Frog   Some cool looking concepts for phones, watches etc
2   Digital everything  This new york times article talks about the willingness of consumers
3   E-mails banned at summer camps  E-mails compound feelings of homesickness in kids
4   Simple Multimedia Websites/e-mail   This is a sort of website/e-mail generation site
5   Campground wi-fi    Wi-fi is now on the list of amenities offered at many campgrounds
6   Fog screen  Literally, a screen made by projecting onto fog

This code splits the file on '\n' so that each article is one element of a list:

# Open file and skip first line(headers)
file = open("RBArticlesTabClean.txt", "r", encoding="utf-8")
file.readline()
# Read and decode whole file
articlesFile = htmlcodes.decodeString(file.read()).lower()
# Split file into its lines
articlesFileList = articlesFile.split("\n")

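To test that this works and that the program reads the file correctly, I iterate over the resulting list of articles and print out the whole file: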

for each in articlesFileList:
    input(each)

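When run in IDLE, this works as expected, printing out each line (in lower case) every time the user presses Enter.

However, when the script is run from the command prompt, it fails after printing three articles, with this error: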

Traceback (most recent call last):
  File "E:\Python\RBTrends\RBTrendsAnalysis.py", line 52, in <module>
    print(each)
  File "C:\Python34\lib\encodings\cp850.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2019' in position 89: character maps to <undefined>

I have two questions:

1) Why am I getting this error?

2) Why is there a difference between running the program in IDLE and in the command prompt?

Answer 1

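As far as I know, IDLE is able to display Unicode characters, whereas the command prompt can't do much better than plain old ASCII. Your traceback shows the console output being encoded with cp850, and that code page has no mapping for '\u2019' (the right single quotation mark), so the encode step raises UnicodeEncodeError. That is why you get this error.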
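
The failure can be reproduced without the console at all, by encoding the offending character with the code page named in the traceback. This is a minimal sketch, not the asker's script: the sample string is made up for illustration, and replacing unencodable characters with '?' is only one possible workaround, assuming a lossy substitution is acceptable for debugging output.

```python
# Hypothetical sample text containing U+2019 (RIGHT SINGLE QUOTATION MARK),
# the character reported in the traceback.
text = "consumers\u2019 willingness"

# cp850 is the console code page named in the traceback.
# It has no byte for U+2019, so a strict encode fails.
try:
    text.encode("cp850")
    failed = False
except UnicodeEncodeError:
    failed = True

# One possible workaround: substitute unencodable characters before printing.
safe = text.encode("cp850", errors="replace").decode("cp850")

print(failed)  # True
print(safe)    # consumers? willingness
```

Checking sys.stdout.encoding in both environments should show which encoding each one uses for output. Setting the PYTHONIOENCODING environment variable before starting Python is another way to change the encoding or the error handler used for standard output.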
