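I have a large index of CSV files, each with x columns and y rows. I want my code to loop through each CSV (using an index-based loop), combine the columns with specific headers into a new column, and then save the CSV to a new path. This is my code so far, but I get the error: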
'utf-8' codec can't decode byte 0xa9 in position 33: invalid start byte
Any ideas?
import os
import pandas as pd

#code to add new row to all csvs with unique identifier stamp that combines the following:
#wellkey+drillkey+lat+long+spuddate
files=['Apr 23 2018.csv','Apr 20 2018.csv']
index=0
os.chdir('file path')
#code to loop through all the files listed above
while index < len(files):
    os.chdir('file path')
    current_file=files[index]
    #unique identifier column
    df=pd.read_csv(current_file)
    df['Unique Identifier']=(df['A'] + "-" + df['B'] + "-" + df['C'] + "-" +
                             df['D'] + "-" + df['E'])
    df.to_csv(current_file)
    #save new csv
    os.chdir('New file Path')
    index = index + 1
Thanks for any suggestions/comments/corrections.
Answer 0 (score: 0)
When I run into this problem, the first thing I try is adding encoding='ISO-8859-1' to my pd.read_csv() call.
So your statement would look like this:
df=pd.read_csv(current_file, encoding='ISO-8859-1')
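Some background on why this tends to work: byte 0xa9 is the copyright sign in ISO-8859-1, but it is not a valid first byte of a UTF-8 sequence, so the file was most likely saved in a single-byte encoding. Below is a rough sketch of the whole loop with that fix applied. The helper names (read_csv_with_fallback, add_unique_identifier, process_files) and the src_dir/dst_dir parameters are my own placeholders, not anything from pandas, and the fallback list assumes your files are either UTF-8 or ISO-8859-1. It also casts the columns to strings before joining, on the assumption that some of them (lat, long) may be numeric, and writes to the destination folder with an explicit path instead of os.chdir.

```python
import os
import pandas as pd


def read_csv_with_fallback(path, encodings=("utf-8", "ISO-8859-1")):
    """Try each encoding in turn; return the first DataFrame that decodes."""
    if not encodings:
        raise ValueError("at least one encoding is required")
    last_error = None
    for enc in encodings:
        try:
            return pd.read_csv(path, encoding=enc)
        except UnicodeDecodeError as err:
            # Remember the failure and move on to the next candidate encoding.
            last_error = err
    raise last_error


def add_unique_identifier(df, columns=("A", "B", "C", "D", "E"), sep="-"):
    """Return a copy of df with the given columns joined into one string column."""
    out = df.copy()
    # astype(str) lets numeric columns be joined with the string columns.
    out["Unique Identifier"] = out[list(columns)].astype(str).agg(sep.join, axis=1)
    return out


def process_files(files, src_dir, dst_dir):
    """Read each file from src_dir, add the identifier, write it to dst_dir."""
    os.makedirs(dst_dir, exist_ok=True)
    for name in files:
        df = read_csv_with_fallback(os.path.join(src_dir, name))
        df = add_unique_identifier(df)
        # index=False avoids writing the row index as an extra unnamed column.
        df.to_csv(os.path.join(dst_dir, name), index=False)
```

You would call it as process_files(['Apr 23 2018.csv', 'Apr 20 2018.csv'], 'file path', 'New file Path'). Note that ISO-8859-1 accepts every byte value, so it never raises; if the text comes out with odd characters, the real encoding may be something else (cp1252 is a common one for files exported on Windows).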