I have a pandas Series full of strings:
In:
import pandas as pd
s = pd.Series(['This is a single line.', 'This is another one.', 'This is a string\nwith more than one line.'])
Out:
0 This is a single line.
1 This is another one.
2 This is a string\nwith more than one line.
dtype: object
How can I split every row of this Series that contains a newline character (\n) into separate rows? What I expect is:
0 This is a single line.
1 This is another one.
2 This is a string
3 with more than one line.
dtype: object
I know I can split each row on the newline character with
s = s.str.split('\n')
which gives
0 [This is a single line.]
1 [This is another one.]
2 [This is a string, with more than one line.]
But this only splits the string into a list inside its existing row; it does not give each piece a row of its own.
Answer 0 (score: 4)
You can iterate over every string in every row to build a new Series:
pd.Series([j for i in s.str.split('\n') for j in i])
It may make more sense to do this on the input itself rather than creating a temporary Series, for example:
strings = ['This is a single line.', 'This is another one.', 'This is a string\nwith more than one line.']
pd.Series([j for i in strings for j in i.split('\n')])
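If the data is already in a Series, another option (not part of the approach above) is to combine str.split with Series.explode, which is available in pandas 0.25 and later. This is only a minimal sketch, assuming a recent pandas version; the variable name result is just for illustration:

```python
import pandas as pd

s = pd.Series([
    'This is a single line.',
    'This is another one.',
    'This is a string\nwith more than one line.',
])

# Split each string into a list of lines, then give every list element its own row.
# explode() repeats the original index label for pieces that came from the same string,
# so reset_index(drop=True) renumbers the rows 0..n-1.
result = s.str.split('\n').explode().reset_index(drop=True)

print(result.tolist())
# ['This is a single line.', 'This is another one.', 'This is a string', 'with more than one line.']
```

Leaving out reset_index(drop=True) keeps the repeated index labels, which can be handy if you need to trace each line back to the row it came from.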