Question

I'm trying to upload data with appcfg.py upload_data. My CSV was encoded as ANSI, but Alex Martelli said it should be UTF-8, so I converted it (using Notepad++).

That produces an error at the very first character of my file:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 0: ordinal not in range(128)

Then I switched back to ANSI and got:

Error: new-line character seen in unquoted field - do you need to open the file in universal-newline mode?

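Hmm... it looks like other people have had a similar problem. What is the most efficient way to remove the newline character at the end of every line using Notepad++? Or is there something else I should be doing?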

Answer 1

I ran into a similar problem when fetching a file over HTTP that was probably UTF-8. I fixed it by converting the string to unicode with:

unicodecontent = unicode(content, 'utf8')

Then, whenever I needed it as a plain byte string again, I encoded it back to UTF-8:

unicodecontent.encode('utf_8')

This worked for me when I was trying to parse an XML file with ElementTree (fromstring).
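For the CSV case in the question, byte 0xef at position 0 is most likely the first byte of a UTF-8 byte order mark (EF BB BF), which Notepad++ writes unless the file is saved as UTF-8 without BOM. Here is a minimal sketch of decoding such a file while skipping the BOM and tolerating Windows line endings. It is written for Python 3, which is an assumption (the snippets above are Python 2), and `read_csv_rows` and the sample data are hypothetical names made up for illustration, not part of appcfg.py:

```python
import csv
import io


def read_csv_rows(raw_bytes):
    """Decode CSV bytes, dropping a UTF-8 BOM if present, and return the rows."""
    # 'utf-8-sig' decodes UTF-8 and silently skips a leading BOM (EF BB BF).
    text = raw_bytes.decode('utf-8-sig')
    # newline='' leaves line endings untranslated so the csv module can
    # handle \r\n, \r and \n itself.
    return list(csv.reader(io.StringIO(text, newline='')))


# Hypothetical sample: BOM plus Windows line endings.
sample = b'\xef\xbb\xbfname,city\r\nAna,S\xc3\xa3o Paulo\r\n'
rows = read_csv_rows(sample)
for row in rows:
    print(row)
```

On Python 2, the rough equivalent for the second error would be opening the file in universal-newline mode, e.g. `open(path, 'rU')`. Whether the upload tool lets you control how the file is opened is a separate question, so re-saving the file without a BOM may be the simpler fix.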