Insert missing lines into a repeating pattern in a text file with Python

Time: 2016-03-23 11:32:16

Tags: regex python-2.7 io

I have a text file of the form:

first thing:    content 1
second thing:   content 2
third thing:    content 3
fourth thing:   content 4

This pattern repeats throughout the text file. However, sometimes one of the lines is missing entirely:

first thing:    content 1
second thing:   content 2
fourth thing:   content 4

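How can I search the document for these missing lines and add them back with a value of "NA" or some other filler, producing a new text file like this: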

# 'third thing' was not there, so re-adding it with NA as content
first thing:    content 1
second thing:   content 2
third thing:    NA 
fourth thing:   content 4

Current code skeleton:

with open('original.txt', 'r') as infile:
    with open('output.txt', 'w') as outfile:
        # Search file for pattern (maybe regex?)
        # If pattern does not exist, add the line
        pass

Thanks for all your help!

2 answers:

Answer 0: (score: 1)

You have to look for a blank line, then 1 to 3 lines (fewer than 4), then another blank line:

^\n([^\n]*\n){1,3}\n

Demo: https://regex101.com/r/rL3eA5/2
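As a rough illustration, the pattern above could be applied in Python as follows. This is only a sketch under a few assumptions: blocks are separated by single blank lines, the `re.MULTILINE` flag is used so that `^` matches at the start of every line, and the names `SHORT_BLOCK`, `find_short_blocks` and `sample` are made up for the example.

```python
import re

# Regex from the answer; MULTILINE (an assumption) lets ^ match at each line start.
SHORT_BLOCK = re.compile(r'^\n([^\n]*\n){1,3}\n', re.MULTILINE)

def find_short_blocks(text):
    """Return each block of fewer than four lines as a list of its lines."""
    return [m.group(0).strip('\n').split('\n') for m in SHORT_BLOCK.finditer(text)]

# Hypothetical sample: a complete block, a block missing 'third thing', a complete block.
sample = (
    "first thing:    content 1\n"
    "second thing:   content 2\n"
    "third thing:    content 3\n"
    "fourth thing:   content 4\n"
    "\n"
    "first thing:    content 1\n"
    "second thing:   content 2\n"
    "fourth thing:   content 4\n"
    "\n"
    "first thing:    content 1\n"
    "second thing:   content 2\n"
    "third thing:    content 3\n"
    "fourth thing:   content 4\n"
)

short_blocks = find_short_blocks(sample)
```

Two limitations follow from how the pattern is written: a short block at the very start of the file has no blank line before it and is not matched, and since each match consumes the blank line that follows it, the second of two short blocks in a row would not be reported. The pattern also only finds the short block; it does not say which line is missing.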

Answer 1: (score: 1)

It's not pretty, but it works. Here is a regex that detects a missing line:

(?:^|\n)(second thing:\s*[^\n]+\n)|(first thing:\s*[^\n]+\n(?!second thing:))|(second thing:\s*[^\n]+\n(?!third thing:))|(third thing:\s*[^\n]+\n(?!fourth thing:))|(third thing:\s*[^\n]+\n\n)

regex101 demo here

Note the Single Line flag.

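When you get a match, check which capture group matched. If it is the first group, the first line is missing. If it is the second group, the second line is missing, and so on for the third and fourth lines.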

Here's an example of how to replace when the 1st group matches

Here's an example of how to replace when the 3rd group matches

Here's an example of how to replace when the 4th group matches

You may have to make some adjustments, but it should get you on your way ;)
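If the regex turns out to be hard to adjust, one alternative is to skip it and walk the file line by line, keeping track of which label is expected next. The sketch below is that alternative technique, not the regex approach above. It assumes every line starts with one of the four labels from the question, that at most a few lines (not a whole block) are missing, and that content is aligned at column 16 as in the sample. `LABELS`, `fill_missing` and `process_file` are names invented for the example.

```python
LABELS = ["first thing:", "second thing:", "third thing:", "fourth thing:"]

def fill_missing(lines, filler="NA", width=16):
    """Return the lines with absent labelled lines re-added using `filler`."""
    out = []
    expected = 0  # index in LABELS of the line expected next
    for raw in lines:
        line = raw.rstrip("\n")
        found = next((i for i, lab in enumerate(LABELS) if line.startswith(lab)), None)
        if found is None and line.strip():
            out.append(line)  # unrecognised text: pass through unchanged
            continue
        # A blank line closes the current block, so the block must be complete.
        target = 0 if found is None else found
        while expected != target:
            out.append(LABELS[expected].ljust(width) + filler)
            expected = (expected + 1) % len(LABELS)
        out.append(line)
        if found is not None:
            expected = (found + 1) % len(LABELS)
    # Complete a block that is cut short at the end of the input.
    while expected != 0:
        out.append(LABELS[expected].ljust(width) + filler)
        expected = (expected + 1) % len(LABELS)
    return out

def process_file(src="original.txt", dst="output.txt"):
    # File names taken from the question; text mode is assumed.
    with open(src, "r") as infile:
        fixed = fill_missing(infile)
    with open(dst, "w") as outfile:
        outfile.write("\n".join(fixed) + "\n")
```

Calling `process_file()` would read `original.txt` and write the repaired text to `output.txt`. Because the function only looks at the order of the labels, it does not depend on blank lines between blocks, though it uses them when present to notice a missing last line.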

Regards.