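I am using pandas to merge two csv files that may have different column headers. The problem I am running into is that rows seem to be split onto a new line at random.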
File 1:
ID, Height
0 , 1
1 , 2
2 , 3
File 2:
ID, Message
0 , "Long string message"
1 , "May include tabs, multiple lines \n
that go on for a while"
2 , "More of the same"
The result should be:
ID, Height, Message
0, 1, '',
1, 2, '',
2, 3, '',
0, '', "Long string message",
1, '', "May include tabs, multiple lines \n
that go on for a while",
2, '', "More of the same"
What I get is:
ID, Height, Message
0, 1, '',
1, 2, '',
2, 3, '',
0, '', "Long string message",
1, '', "May include tabs, multiple lines"
"that go on for a while", '', '',
2, '', "More of the same"
This is, for the most part, what I have been using:
import pandas as pd

first = pd.read_csv('file1.csv')
second = pd.read_csv('file2.csv')
merged = pd.concat([first, second], axis=0, ignore_index=True)
merged.to_csv('test.csv')
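It looks like an extra line inside the message field causes the row to be split into a new row. How can I stop rows from being split on newlines inside the message field?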
Answer 0 (score: 1)
From the short example, you can see that it starts a new row at the newline \n.
You can try first = pd.read_csv('file1.csv', delim_whitespace=True).
Try changing the separator, lineterminator, or similar delimiter-like parameters of read_csv.
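As a rough sketch of one of those read_csv parameters in action: assuming the real files look like the sample, with a space between the comma and the opening quote, the quote may not be treated as the start of a quoted field, so the newline inside the message ends the row. Passing skipinitialspace=True (an existing read_csv parameter) is one thing to try. The inline strings FILE1 and FILE2 and the helper merge_csv_text below are made up for illustration and stand in for the actual files.

```python
import io

import pandas as pd

# Hypothetical stand-ins for file1.csv and file2.csv, with a space after each comma.
FILE1 = "ID, Height\n0, 1\n1, 2\n2, 3\n"
FILE2 = (
    'ID, Message\n'
    '0, "Long string message"\n'
    '1, "May include tabs, multiple lines\nthat go on for a while"\n'
    '2, "More of the same"\n'
)


def merge_csv_text(text1, text2):
    """Read two CSV texts and stack them row-wise (hypothetical helper)."""
    # skipinitialspace=True skips the space after the delimiter, so the
    # opening quote is the first character of the field and the embedded
    # newline stays inside the quoted value.
    first = pd.read_csv(io.StringIO(text1), skipinitialspace=True)
    second = pd.read_csv(io.StringIO(text2), skipinitialspace=True)
    return pd.concat([first, second], axis=0, ignore_index=True)


merged = merge_csv_text(FILE1, FILE2)

# Write without the index column, then read back to check the row count.
csv_text = merged.to_csv(index=False)
round_trip = pd.read_csv(io.StringIO(csv_text))
```

If this matches your case, merged should have six rows, and the multi-line message should come back as a single value after to_csv writes it as a quoted field.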