How to convert CSV files in various encodings to UTF-8

Time: 2017-11-26 07:18:20

Tags: python csv encoding utf-8

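I downloaded 95 small CSV files from the web. Their schemas should be very similar. I am trying to concatenate them with Python pandas, but the files come in several different encodings, which causes problems when calling pd.read_csv. I am not sure what the best way is to convert them all to one consistent encoding, e.g. UTF-8. The encodings include: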

ASCII text, with CRLF line terminators
Little-endian UTF-16 Unicode English text, with CRLF line terminators
Little-endian UTF-16 Unicode text, with CRLF line terminators
Little-endian UTF-16 Unicode text, with CRLF, CR line terminators
UTF-8 Unicode (with BOM) English text, with CRLF line terminators
UTF-8 Unicode (with BOM) text, with CRLF line terminators

The list above was generated with

file -b *.csv | sort | uniq

1 answer:

Answer 0: (score: 0)

Have you tried:

import pandas as pd

file = "example.csv"  # hypothetical placeholder: path to one of the CSV files
df = pd.read_csv(file, encoding='utf-8')
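Note that a fixed encoding of 'utf-8' cannot decode the UTF-16 files in the list, since their leading bytes are not valid UTF-8. One possible approach is to pick the codec per file from its byte-order mark (BOM) and rewrite each file as UTF-8 before loading. The following is a minimal sketch, not a definitive solution. It assumes that only the encodings listed in the question occur, that the UTF-16 files start with a BOM, and that files without a BOM are ASCII or plain UTF-8. The function names detect_encoding, convert_to_utf8 and convert_directory are made up for illustration.

```python
import codecs
from pathlib import Path


def detect_encoding(raw: bytes) -> str:
    """Guess a codec name from a leading byte-order mark (BOM)."""
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"  # this codec drops the BOM while decoding
    if raw.startswith(codecs.BOM_UTF16_LE) or raw.startswith(codecs.BOM_UTF16_BE):
        return "utf-16"  # this codec reads the BOM to choose the byte order
    # Assumption: no BOM means ASCII or plain UTF-8 (ASCII is a subset of UTF-8).
    return "utf-8"


def convert_to_utf8(src, dst) -> str:
    """Rewrite src as BOM-less UTF-8 at dst; return the codec used for decoding."""
    raw = Path(src).read_bytes()
    encoding = detect_encoding(raw)
    text = raw.decode(encoding)
    # Working on bytes keeps the original CRLF/CR line terminators untouched.
    Path(dst).write_bytes(text.encode("utf-8"))
    return encoding


def convert_directory(src_dir, dst_dir) -> dict:
    """Convert every *.csv in src_dir into dst_dir; map file name to detected codec."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    results = {}
    for path in sorted(Path(src_dir).glob("*.csv")):
        results[path.name] = convert_to_utf8(path, dst / path.name)
    return results
```

After something like convert_directory(".", "utf8") (both directory names are placeholders), every file in the output directory should be readable with pd.read_csv(path, encoding='utf-8'). Writing to a separate directory rather than overwriting in place leaves the downloads intact if a file turns out to use an encoding this sketch does not cover.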