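I downloaded 95 small CSV files from the web. Their schemas should be very similar. I am trying to concatenate them with Python pandas, but the files' various encodings cause problems when I call pd.read_csv, and I am not sure what the best way is to convert them all to one consistent encoding such as UTF-8. The encodings include: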
ASCII text, with CRLF line terminators
Little-endian UTF-16 Unicode English text, with CRLF line terminators
Little-endian UTF-16 Unicode text, with CRLF line terminators
Little-endian UTF-16 Unicode text, with CRLF, CR line terminators
UTF-8 Unicode (with BOM) English text, with CRLF line terminators
UTF-8 Unicode (with BOM) text, with CRLF line terminators
The list above was generated with:
file -b *.csv | sort | uniq
Answer 0 (score: 0)
Have you tried:
import pandas as pd

# `file` is the path to one CSV file; this decodes only files that really are UTF-8 or ASCII
df = pd.read_csv(file, encoding='utf-8')
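
A fixed encoding='utf-8' cannot decode the UTF-16 files in the question's list, so one possible approach is to pick the encoding per file from its byte-order mark (BOM) and rewrite everything as plain UTF-8. The following is a minimal sketch rather than a definitive solution. It assumes every UTF-16 file starts with a BOM (which the "Little-endian UTF-16" output from file suggests) and that files without a BOM are ASCII or UTF-8. The function names sniff_encoding and convert_to_utf8 are hypothetical helpers, not part of pandas.

```python
import codecs


def sniff_encoding(path, default="utf-8"):
    """Guess a codec name from the BOM at the start of the file.

    Assumption: files without a BOM are ASCII or UTF-8, so `default` applies.
    """
    with open(path, "rb") as fh:
        head = fh.read(4)
    if head.startswith(codecs.BOM_UTF8):
        # "utf-8-sig" strips the BOM while decoding
        return "utf-8-sig"
    if head.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # "utf-16" reads the BOM to choose the byte order, then strips it
        return "utf-16"
    return default


def convert_to_utf8(src, dst):
    """Rewrite `src` as UTF-8 without a BOM, keeping line endings unchanged."""
    # newline="" turns off newline translation on both read and write
    with open(src, "r", encoding=sniff_encoding(src), newline="") as fin:
        text = fin.read()
    with open(dst, "w", encoding="utf-8", newline="") as fout:
        fout.write(text)
```

After conversion, pd.read_csv(dst, encoding='utf-8') should work for every file. Alternatively, skip the rewrite and pass the detected name straight to pandas, as in pd.read_csv(path, encoding=sniff_encoding(path)), then combine the resulting frames with pd.concat.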