Reading a CSV with pandas when the column name contains punctuation

Time: 2020-08-05 23:04:42

Tags: pandas csv

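I have a CSV file like the one below. It has a single column (cust_code) whose header is wrapped in quotes, and every row is wrapped in quotes as well.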

“CUST_CODE”
“CST001001”
“CST000235”
“CST010231”
“CST010235”
“CST010231”
“CST010235”
“CST010231”
“CST040015”

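I tried to read the file with pandas, but I got this error message:

'utf-8' codec can't decode byte 0x93 in position 0: invalid start byte

I also tried passing the encoding as ascii and as utf-8, but neither worked.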

1 Answer:

Answer 0: (score: 1)

Try passing encoding='cp1252'. Be sure to replace 'Documents\Book1.csv' with the path to your own file:

import pandas as pd
# The raw string (r'...') keeps the backslash in the Windows path literal.
df = pd.read_csv(r'Documents\Book1.csv', encoding='cp1252')
df

    “CUST_CODE”
0   “CST001001”
1   “CST000235”
2   “CST010231”
3   “CST010235”
4   “CST010231”
5   “CST010235”
6   “CST010231”
7   “CST040015”
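
Note that the curly quotes are still part of the header and of every value in the output above. pandas only treats the straight double quote (") as a quoting character by default, so “ and ” are read as ordinary text. If you want them gone, one possible approach is to strip them after loading. This is a minimal sketch: the in-memory `sample` bytes are a hypothetical stand-in for the real file, and the name `curly` is just for illustration.

```python
import io

import pandas as pd

# Hypothetical stand-in for the file on disk: a few of the sample rows,
# encoded as cp1252 so the curly quotes become the bytes 0x93 and 0x94.
sample = "\u201cCUST_CODE\u201d\n\u201cCST001001\u201d\n\u201cCST000235\u201d\n".encode("cp1252")

df = pd.read_csv(io.BytesIO(sample), encoding="cp1252")

# Left and right curly double quotes, written as Unicode escapes.
curly = "\u201c\u201d"

# Strip the quotes from the column name first, then from the values.
df.columns = df.columns.str.strip(curly)
df["CUST_CODE"] = df["CUST_CODE"].str.strip(curly)
```

With a real file, replace `io.BytesIO(sample)` with the file path; the two stripping lines stay the same.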

Here is the Wikipedia article with more information about this encoding: https://en.wikipedia.org/wiki/Windows-1252. To quote from the article:

"...common result was that all the quotes and apostrophes (produced by "smart quotes" in word-processing software) were replaced with question marks or boxes on non-Windows operating systems, making text difficult to read."
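
To see why utf-8 and ascii both fail while cp1252 works, you can decode the offending byte by hand. The sketch below assumes the file starts with the bytes 0x93 ... 0x94, which is what the error message and the curly quotes in the data suggest; the `raw` value is a made-up example, not the actual file contents.

```python
# Assumed first line of the file: the header wrapped in cp1252 "smart quotes".
raw = b"\x93CUST_CODE\x94"

error_message = None
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    # 0x93 is a continuation byte in UTF-8, so it cannot start a character.
    error_message = str(exc)

# In Windows-1252, 0x93 and 0x94 are the left and right curly double quotes.
decoded = raw.decode("cp1252")

print(error_message)
print(decoded)
```

ASCII fails for a simpler reason: it only covers bytes 0x00 to 0x7F, and 0x93 is outside that range.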