Question

I have some text containing START and END markers:

SOURCE = '''
Text with \n \n and some more # an so ..

other text to be ignored
START
docu \n this text \n I need includive the capital start and end
but do not split \n \n only split at the actuall end of the line
END

gfsdfgadgfg \n\n\n \n
5 635634
START
similar # to the above I need \n all of this in the split line
but do not split \n \n only split at the actuall end of the line
END


more text to ignore
'''

and would like to turn it into something like this:

parts_splitted_by_actual_end_of_line = {
'Part1_lines' : 
['START',
'docu \n this text \n I need includive the capital start and end',
'but do not split \n \n only split at the actuall end of the line',
'END'],

'Part2_lines' : 
['START',
'similar # to the above I need \n all of this in the split line',
'but do not split \n \n only split at the actuall end of the line',
'END'],
}

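I can find the START and END markers with string find and extract the text between them.

But I am completely stuck on how to split it into lines at the real line ends without also splitting at each \n.

Any suggestions would be greatly appreciated.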

Answer 1

You want to use a raw string. Add an r prefix before the string literal, like this:

SOURCE = r'''Insert text here\n'''

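In a raw string each \n stays as two literal characters (a backslash followed by n) instead of becoming a newline, so the only newlines left in the string are the real line ends.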
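
A minimal sketch of how the raw string could then be split into the requested dictionary. The helper name extract_parts and the shortened sample text are assumptions made for illustration; they are not from the question.

```python
# Raw string: every \n below is backslash + n, not a newline,
# so only the real line breaks of the literal split the text.
SOURCE = r'''
other text to be ignored
START
docu \n this text \n I need
but do not split \n \n here
END
more text to ignore
START
similar # to the above \n text
END
'''

def extract_parts(text):
    """Collect START..END blocks, splitting only at real line ends."""
    parts = {}
    current = None  # lines of the block being collected, None outside a block
    for line in text.splitlines():
        if line == 'START':
            current = [line]
        elif current is not None:
            current.append(line)
            if line == 'END':
                # Number the parts in the order they are found.
                parts['Part%d_lines' % (len(parts) + 1)] = current
                current = None
    return parts

parts = extract_parts(SOURCE)
```

Each value in parts is a list that begins with 'START' and ends with 'END', and the inline \n sequences are still present as backslash + n inside the lines.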

To unescape them again later (probably after your split), take the string and decode it (this form works in Python 2):

string = string.decode('string_escape')
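
The 'string_escape' codec exists only in Python 2. On Python 3, one possible equivalent is to go through bytes and the 'unicode_escape' codec. This is a sketch that assumes the text is plain ASCII; the variable names are illustrative.

```python
# Text as it comes out of a raw literal: backslash + n, no real newline.
escaped = r'docu \n this text'

# Encode to bytes, then let 'unicode_escape' interpret the backslash sequences.
# Assumption: the text is ASCII-only; other text needs more careful handling.
unescaped = escaped.encode('ascii').decode('unicode_escape')
```

After this step unescaped contains a real newline where the \n sequence used to be.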
