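I'm trying to read a CSV file into a pandas DataFrame in Python, but one column contains an extra comma because it holds a range
I have a comma-separated CSV with 13 columns of data, but one of the columns holds a range of values and uses an extra comma. The header looks like this: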
"A","B","C","D","E","F","G","H","I","J","K","L","M"
But the values in each data row look like this:
"A",B,B,"C","D","E","F","G","H",I,"J",K,L,M
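I'm trying to read this into a pandas DataFrame in Python, but because of the inconsistency it treats the tuple as two columns. How can I change the CSV so that it is easier to parse?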
Answer 0 (score: 0)
You can do the following:
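import csv
import pandas as pd

def fixed_lines(filename):
    with open(filename, newline='') as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row (optional)
        for row in reader:
            # Rejoin the two halves of the range that were split on the extra comma
            yield row[:1] + [row[1] + ',' + row[2]] + row[3:]

df = pd.DataFrame(fixed_lines('filename.csv'))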
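
A variation on the same idea, as a rough sketch: instead of discarding the header, it can be reused as the column names so that the merged rows line up with the 13 labels. The sample data, the function name `read_with_merged_range`, and the `start` parameter below are made up for illustration. The sketch assumes the range always occupies the second and third fields of every data row, as in the example row from the question.

```python
import csv
import io

import pandas as pd

# Hypothetical sample data: 13 header labels, 14 fields per data row
# because the range in column B is split by an unquoted comma.
SAMPLE = '''"A","B","C","D","E","F","G","H","I","J","K","L","M"
"a1",1,5,"c1","d1","e1","f1","g1","h1",9,"j1",10,11,12
"a2",2,7,"c2","d2","e2","f2","g2","h2",8,"j2",20,21,22
'''


def read_with_merged_range(handle, start=1):
    """Read CSV rows, joining fields start and start+1 into one value."""
    reader = csv.reader(handle)
    header = next(reader)  # keep the header to use as column names
    rows = [
        row[:start] + [row[start] + "," + row[start + 1]] + row[start + 2:]
        for row in reader
    ]
    return pd.DataFrame(rows, columns=header)


df = read_with_merged_range(io.StringIO(SAMPLE))
print(df.loc[0, "B"])
```

All values come back as strings, since `csv.reader` does no type conversion; numeric columns would need to be converted afterwards. For a real file, pass an open file handle (opened with `newline=''`) instead of the `io.StringIO` object.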