Modify CSV header to match values

Time: 2015-01-21 18:15:17

Tags: python parsing csv pandas

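I am trying to read a CSV file into a pandas DataFrame in Python, but one column contains an extra comma because it holds a range.

I have a comma-separated CSV with 13 columns of data, but one of the columns holds a range of values and uses an extra comma. The header looks like this: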

"A","B","C","D","E","F","G","H","I","J","K","L","M"

But the values in each data row look like this:

"A",B,B,"C","D","E","F","G","H",I,"J",K,L,M

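I am trying to read this into a pandas DataFrame in Python, but because of the inconsistency the tuple is treated as two columns. How would I change the CSV so that it is easier to parse?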

1 Answer:

Answer 0: (score: 0)

You can merge the two split fields back into one as you read each row, like this:

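import csv
import pandas as pd

def fixed_lines(filename):
    # newline='' is what the csv module recommends when opening files
    with open(filename, newline='') as f:
        reader = csv.reader(f)
        next(reader)       # skip the header row (optional)
        for row in reader:
            # rejoin the range that was split across fields 1 and 2
            yield row[:1] + [row[1] + ',' + row[2]] + row[3:]

df = pd.DataFrame(fixed_lines('filename.csv'))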
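
If you would rather keep the header as column names instead of skipping it, something along these lines might work. This is only a sketch: `read_with_header`, `split_at`, and the `SAMPLE` data are names made up for illustration (the sample is shortened to 4 columns), and it assumes the split column is always in the same position, as in the question.

```python
import csv
import io

import pandas as pd

# Hypothetical sample mirroring the layout in the question, shortened to 4 columns:
# the header has 4 names, but each data row has 5 fields because "B" is a range.
SAMPLE = '"A","B","C","D"\n"a1",1,5,"c1","d1"\n"a2",2,8,"c2","d2"\n'


def read_with_header(handle, split_at=1):
    """Read a CSV whose column at index `split_at` was split in two by a stray comma."""
    reader = csv.reader(handle)
    header = next(reader)  # keep the header row to use as column names
    rows = [
        # rejoin the two halves of the range into a single field
        row[:split_at] + [row[split_at] + "," + row[split_at + 1]] + row[split_at + 2:]
        for row in reader
    ]
    return pd.DataFrame(rows, columns=header)


df = read_with_header(io.StringIO(SAMPLE))
print(df)
```

For a real file you would pass `open('filename.csv', newline='')` instead of the `io.StringIO` object. Note that the merged column stays a string such as `"1,5"`; splitting it into two numeric columns would be a separate step.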