I have a CSV file like the one below. It has a single column (cust_code), and both the header and every row are wrapped in quotation marks:
“CUST_CODE”
“CST001001”
“CST000235”
“CST010231”
“CST010235”
“CST010231”
“CST010235”
“CST010231”
“CST040015”
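I tried to read the file with pandas, but I got this error message:
'utf-8' codec can't decode byte 0x93 in position 0: invalid start byte
I also tried passing ascii and utf-8 as the encoding, but neither worked.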
Answer 0 (score: 1)
Try passing encoding='cp1252'. Be sure to replace 'Documents\Book1.csv' with the path to your own file in the following:
import pandas as pd
# Raw string so the backslash in the Windows path is taken literally
df = pd.read_csv(r'Documents\Book1.csv', encoding='cp1252')
df
“CUST_CODE”
0 “CST001001”
1 “CST000235”
2 “CST010231”
3 “CST010235”
4 “CST010231”
5 “CST010235”
6 “CST010231”
7 “CST040015”
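Note that the curly quotes are still part of the data after decoding, because by default pandas only treats the straight double quote (") as a quote character. If you want them removed, one possible approach is sketched below. It assumes the curly quotes are never legitimate data, and it uses a small in-memory sample (the hypothetical raw variable) in place of your file so that it runs on its own; neither assumption comes from the original post.

```python
import io

import pandas as pd

# Hypothetical stand-in for the file: cp1252 bytes 0x93/0x94 are the curly quotes
raw = b"\x93CUST_CODE\x94\n\x93CST001001\x94\n\x93CST000235\x94\n"

# Decode as cp1252, exactly as with the real file path
df = pd.read_csv(io.BytesIO(raw), encoding="cp1252")

# Left and right curly double quotes (U+201C and U+201D)
curly = "\u201c\u201d"

# Strip the curly quotes from the header and from every value
df.columns = df.columns.str.strip(curly)
df["CUST_CODE"] = df["CUST_CODE"].str.strip(curly)

print(df)
```

With a real file you would pass the path instead of io.BytesIO(raw) and keep the two strip lines unchanged.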
For more information about this encoding, see the Wikipedia article: https://en.wikipedia.org/wiki/Windows-1252. To quote the article:
"...common result was that all the quotes and apostrophes (produced by "smart quotes" in word-processing software) were replaced with question marks or boxes on non-Windows operating systems, making text difficult to read."
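To see why the byte 0x93 is the culprit, here is a minimal sketch using only the standard library. The sample bytes are an assumption standing in for the first two lines of the file: in cp1252, 0x93 and 0x94 are the left and right curly double quotes, whereas in UTF-8 0x93 can only appear as a continuation byte, so it is invalid at the start of a character.

```python
# Hypothetical sample: the first two lines of the file as cp1252 bytes
raw = b"\x93CUST_CODE\x94\n\x93CST001001\x94\n"

# Decoding as UTF-8 fails on the very first byte
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError as exc:
    utf8_ok = False
    print(exc)

# Decoding as cp1252 maps 0x93 and 0x94 to the curly quotes U+201C and U+201D
text = raw.decode("cp1252")
print(text)
```

ASCII fails for a similar reason: it only covers byte values 0 through 127, and 0x93 is 147.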