How do I convert a Notepad++ (Scintilla) regular expression that matches across newlines to Python?

Asked: 2018-01-31 21:32:20

Tags: python regex notepad++

I have the following text in a file. I have also tried it in my program as a string (using '''mytext''').

RECORD1  Sed similique nostrum quibusdam minus. Rerum repudiandae et ipsum numquam commodi repellendus. Aut minima ratione vel 
beatae minima reprehenderit provident neque. Earum quam temporibus repudiandae quidem officiis
RECORD2 Sed similique nostrum quibusdam minus. Rerum repudiandae et ipsum numquam commodi repellendus. Aut minima ratione vel 
beatae minima reprehenderit provident neque. Earum quam temporibus repudiandae quidem officiis
RECORD3   It is a long established fact that a reader will be distracted by the readable content of a page when looking at its 
layout. 
RECORD4 

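If I use Notepad++'s Find with

(RECORD.*?\s).*?(?=(RECORD.*?\s))   (with ". matches newline" checked)

I can match from one RECORDx up to just before the next RECORDx. In other words, because of the lookahead, I get the following.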

RECORD1  Sed similique nostrum quibusdam minus. Rerum repudiandae et ipsum numquam commodi repellendus. Aut minima ratione vel 
beatae minima reprehenderit provident neque. Earum quam temporibus repudiandae quidem officiis

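So I get only the record I need. Notepad++ does this through the positive lookahead (?=(RECORD.*?\s)) combined with the ". matches newline" option. This does not seem to work in Python, and I do not know how to write it correctly. How do I do a lookahead in Python the way I do in Notepad++?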

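I have looked at this: https://markantoniou.blogspot.com/2008/06/notepad-how-to-use-regular-expressions.html

but I could not work out what to do from it.

Here is my Python; it returns to the prompt without printing anything. I know re is working, because a pattern such as .* works fine, and even (RECORD.*?\s) returns only the literal RECORD text.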

import re
regex = r"(RECORD.*?\s).*?(?=(RECORD.*?\s))"
filepath = 'test.txt'
with open(filepath) as fp:
    data = fp.read()
matches = re.finditer(regex, data)
for matchNum, match in enumerate(matches):
    matchNum = matchNum + 1

    print ("Match {matchNum} was found at {start}-{end}: {match}".format(matchNum = matchNum, start = match.start(), end = match.end(), match = match.group()))

    for groupNum in range(0, len(match.groups())):
        groupNum = groupNum + 1

        print ("Group {groupNum} found at {start}-{end}: {group}".format(groupNum = groupNum, start = match.start(groupNum), end = match.end(groupNum), group = match.group(groupNum)))

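(I wrote the actual regular expression and the file-reading part; most of the rest was generated by https://regex101.com/.) I have also looked elsewhere, including the question "Python regex positive look ahead", and I have tried various combinations of the Python and Notepad++ patterns.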
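
One likely cause, offered as a sketch rather than a definitive answer: the ". matches newline" checkbox in Notepad++ has no counterpart that is on by default in Python, where `.` does not match `\n` unless `re.DOTALL` (or the inline flag `(?s)`) is given. The lookahead syntax itself is the same in both. The snippet below runs the same pattern with and without the flag on a shortened, made-up stand-in for test.txt; the `data` string and the variable names are assumptions for illustration, not the original file.

```python
import re

# Hypothetical stand-in for the contents of test.txt: each record spans
# two lines, as in the sample text, and the last record has no body.
data = (
    "RECORD1  alpha one \n"
    "alpha two\n"
    "RECORD2 beta one \n"
    "beta two\n"
    "RECORD3   gamma one \n"
    "gamma two. \n"
    "RECORD4 \n"
)

# Same pattern as in the question, unchanged.
pattern = r"(RECORD.*?\s).*?(?=(RECORD.*?\s))"

# Default flags: "." stops at "\n", so the lazy ".*?" can never reach
# the next RECORD on a later line and the lookahead never succeeds.
without_dotall = [m.group() for m in re.finditer(pattern, data)]

# re.DOTALL makes "." match "\n" as well, which is the Python analogue
# of ticking ". matches newline" in the Notepad++ Find dialog.
with_dotall = [m.group() for m in re.finditer(pattern, data, flags=re.DOTALL)]

print(len(without_dotall), "matches without re.DOTALL")
for number, text in enumerate(with_dotall, start=1):
    print("Match {}: {!r}".format(number, text))
```

If this is indeed the cause, passing `flags=re.DOTALL` to the `re.finditer` call in the script above should be the only change needed. Note that the final record (RECORD4 here) is still not matched, because no further RECORD follows it for the lookahead to find.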

0 answers:

No answers