Question

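I am using pandas to merge two CSV files that may have different column headers. The problem I am running into is that some rows seem to be split onto a new line at random.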

File 1:
ID, Height
0 , 1
1 , 2
2 , 3

File 2:

ID, Message
0 , "Long string message"
1 , "May include tabs, multiple lines \n
     that go on for a while"
2 , "More of the same"

The result should be:

ID, Height, Message
0,    1,     '',
1,    2,     '',
2,    3,     '',
0,    '',    "Long string message",
1,    '',    "May include tabs, multiple lines \n
              that go on for a while",
2,    '',    "More of the same"

What I get is:

ID, Height, Message
0,    1,     '',
1,    2,     '',
2,    3,     '',
0,    '',    "Long string message",
1,    '',    "May include tabs, multiple lines"
"that go on for a while", '', '',
2,    '',    "More of the same"

For the most part, I have been using the following:

import pandas as pd

first = pd.read_csv('file1.csv')
second = pd.read_csv('file2.csv')

merged = pd.concat([first, second], axis=0, ignore_index=True)
merged.to_csv('test.csv')

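It looks like a row is split into a new row whenever the Message field contains an extra line break. How can I stop rows from being split on newlines inside the Message field?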

Answer 1

From your brief example, you can see that it starts a new row at the newline \n.

You can try using first = pd.read_csv('file1.csv', delim_whitespace = True)

Try changing the lineterminator, the separator, or a similar parameter of read_csv.
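
Before changing parameters, it may help to check whether the rows are really being split. Here is a minimal sketch, assuming the multi-line messages are wrapped in double quotes with no space between the comma and the opening quote. The in-memory buffers are hypothetical stand-ins for file1.csv and file2.csv, and index=False is my addition so the row index is not written as an extra column.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-ins for file1.csv and file2.csv.
file1 = io.StringIO("ID,Height\n0,1\n1,2\n2,3\n")
file2 = io.StringIO(
    'ID,Message\n'
    '0,"Long string message"\n'
    '1,"May include tabs, multiple lines\n     that go on for a while"\n'
    '2,"More of the same"\n'
)

first = pd.read_csv(file1)
second = pd.read_csv(file2)

# Stack the rows; columns missing from one frame are filled with NaN.
merged = pd.concat([first, second], axis=0, ignore_index=True)

# Write to a buffer instead of test.csv so the sketch is self-contained.
# With the default quoting, fields containing a newline or a comma are quoted.
out = io.StringIO()
merged.to_csv(out, index=False)

# Read the output back with a CSV parser to count logical rows.
round_trip = pd.read_csv(io.StringIO(out.getvalue()))
print(len(merged), len(round_trip))
```

If both counts match, the data is intact and the extra line you see is just the quoted line break inside one field, which a plain text editor shows as a new physical line. If your real files do have a space after each comma, as in the sample above, passing skipinitialspace=True to read_csv may be needed for the quotes to be recognized.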
