I have a text file of the form:
first thing: content 1
second thing: content 2
third thing: content 3
fourth thing: content 4
This pattern repeats throughout the text file. Sometimes, however, one of the lines is missing entirely:
first thing: content 1
second thing: content 2
fourth thing: content 4
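How can I search the document for these missing lines and add them back with a value of "NA" or some other filler, producing a new text file like this: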
# 'third thing' was not there, so re-adding it with NA as content
first thing: content 1
second thing: content 2
third thing: NA
fourth thing: content 4
Current code boilerplate:
with open('original.txt', 'r') as infile:
    with open('output.txt', 'w') as outfile:
        # Search file for pattern (maybe regex?)
        # If pattern does not exist, add the line
        pass
Thanks for all your help!
Answer 0 (score: 1)
Answer 1 (score: 1)
It's not pretty, but it works. Here is a regex that detects the missing lines:
(?:^|\n)(second thing:\s*[^\n]+\n)|(first thing:\s*[^\n]+\n(?!second thing:))|(second thing:\s*[^\n]+\n(?!third thing:))|(third thing:\s*[^\n]+\n(?!fourth thing:))|(third thing:\s*[^\n]+\n\n)
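Note the Single Line flag.
When you get a match, check which capture group matched. If it is the first group, the first line is missing; if it is the second, the second line is missing; and so on for the third and fourth lines.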
Here's an example of how to replace when the 1st group matches
Here's an example of how to replace when the 3rd group matches
Here's an example of how to replace when the 4th group matches
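For readers working in Python, the replacement step for a single case might look like the sketch below. It is a simplified illustration rather than the full pattern above: it only handles a missing `third thing` line, and it assumes the keys always appear in the same order and that every line ends with a newline. The function name `add_missing_third` is hypothetical.

```python
import re

# A 'second thing' line that is NOT immediately followed by a 'third thing' line.
# re.MULTILINE lets ^ match at the start of every line, not just the start of the text.
MISSING_THIRD = re.compile(
    r"^(second thing:[^\n]*\n)(?!third thing:)",
    re.MULTILINE,
)


def add_missing_third(text, filler="NA"):
    """Insert 'third thing: <filler>' after each 'second thing' line that lacks one."""
    # \g<1> re-emits the matched 'second thing' line unchanged.
    return MISSING_THIRD.sub(r"\g<1>third thing: " + filler + r"\n", text)


sample = (
    "first thing: content 1\n"
    "second thing: content 2\n"
    "fourth thing: content 4\n"
)
print(add_missing_third(sample))
```

The same negative-lookahead idea can be repeated for the other keys, each with its own predecessor line.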
You may have to make some adjustments, but it should get you on your way ;)
Regards.
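If the regex becomes hard to adjust, a regex-free alternative is to walk the file line by line and track which key is expected next. This is a different technique from the one in this answer, shown only as a minimal sketch. It assumes the four keys always appear in the same order, that a key with a lower position than expected starts a new record, and that at least one line of every record survives. Lines with an unrecognised key (including blank lines) are copied through unchanged. The names `KEYS`, `fill_missing`, and `fill_file` are hypothetical.

```python
KEYS = ["first thing", "second thing", "third thing", "fourth thing"]


def fill_missing(lines, keys=KEYS, filler="NA"):
    """Return the lines (without newlines), adding '<key>: <filler>' for each missing key."""
    out = []
    expected = 0  # index into keys of the line we expect to see next

    def pad(start, stop):
        # Emit filler lines for keys[start:stop].
        out.extend(f"{k}: {filler}" for k in keys[start:stop])

    for raw in lines:
        line = raw.rstrip("\n")
        key = line.split(":", 1)[0].strip()
        if key not in keys:
            out.append(line)  # unrecognised line: pass through untouched
            continue
        i = keys.index(key)
        if i < expected:
            # This key belongs to the next record, so finish the current one first.
            pad(expected, len(keys))
            expected = 0
        pad(expected, i)  # fill any keys skipped before this one
        out.append(line)
        expected = (i + 1) % len(keys)
    if expected:
        pad(expected, len(keys))  # the last record ended early
    return out


def fill_file(src="original.txt", dst="output.txt"):
    """Read src, fill in the missing lines, and write the result to dst."""
    with open(src) as infile, open(dst, "w") as outfile:
        for line in fill_missing(infile):
            outfile.write(line + "\n")
```

Calling `fill_file()` with the file names from the question would write the repaired text to `output.txt`. A record that is missing entirely cannot be detected this way.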