Regex to match a multi-line block only when a keyword is in the proper place

Time: 2018-05-09 17:34:26

Tags: regex pcre

I have a virtual instrument file made up of blocks, each of which may contain any combination of attributes, like this:

POINT=69
    Name="M_Frequency Min" Type=ANALOG
    Units="Hz"
    Archive="AVERAGE" Priority=9999 Latch=0
    HysEnable=0 HysVal=0.00000
    Bit="0"
    Category="Meter"
    IsCustom=1
    Interval=0
    Accumulated=0
    DisplayOrder=1
ENDPOINT
POINT=70
    Name="M_Voltage Phase A-N Max" Type=ANALOG
    Units="Volts"
    Archive="AVERAGE" Priority=9999 Latch=0
    HysEnable=0 HysVal=0.000000
    CritHiEnable=0 CritHiLimit=0.000000
    CritLoEnable=0 CritLoLimit=0.000000
    CautHiEnable=0 CautHiLimit=0.000000
    CautLoEnable=0 CautLoLimit=0.000000
    Desc="Voltage Phase A-N Max"
    RW=READ
    Register="9000"
    RegType="H"
    DataType="F"
    Accumulated=0
    DisplayOrder=1
ENDPOINT

Say I want to match only the second block (and not the first), using something like

POINT=[0-9]*(?s)(.*?)(?!ENDPOINT)(\sMax)(.*?)ENDPOINT

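My thinking was that if I let dot-star also match newlines but make it lazy, it would stop as soon as it looked ahead and saw something that disqualifies the match. Clearly, I'm missing something here.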

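Of course, this doesn't work; it matches the entire text instead. I also tried negated character classes, but no luck. What I want to match is a POINT-to-ENDPOINT block only if it contains my target string "Max". I want to disqualify any match that reaches "ENDPOINT" before finding "Max".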
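To make the failure concrete, here is a minimal sketch in Python's re module (an assumption; the question is tagged pcre). The sample is a trimmed copy of the two blocks above, and the inline (?s) is swapped for re.DOTALL because recent Python versions reject inline flags in the middle of a pattern. The negative lookahead is only tested at the single position right before \sMax, so it does nothing to stop the lazy (.*?) from walking past the first ENDPOINT.

```python
import re

# Trimmed copy of the two blocks from the question (most attributes omitted).
SAMPLE = '''POINT=69
    Name="M_Frequency Min" Type=ANALOG
    Units="Hz"
ENDPOINT
POINT=70
    Name="M_Voltage Phase A-N Max" Type=ANALOG
    Units="Volts"
ENDPOINT
'''

# The pattern from the question; re.DOTALL stands in for the inline (?s).
ATTEMPT = re.compile(
    r"POINT=[0-9]*(.*?)(?!ENDPOINT)(\sMax)(.*?)ENDPOINT",
    re.DOTALL,
)

match = ATTEMPT.search(SAMPLE)

# Where the match starts, and how many ENDPOINT lines it swallowed.
first_line = match.group(0).splitlines()[0]
endpoints_crossed = match.group(0).count("ENDPOINT")
print(first_line, endpoints_crossed)
```

On this sample the match begins at the first POINT line and spans both blocks, which is the "matches the entire text" behaviour described above.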

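EDIT1: You can assume there are more blocks like these before and after the snippet shown. I'm specifically trying to grab the blocks that contain the target string (so I can replace them with something else, or delete them). Other blocks may or may not contain the target string, but if they do, I want each block matched separately rather than as a single match.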
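One possible reading of this requirement, sketched in Python as an alternative to a single all-in-one regex: match every POINT...ENDPOINT block on its own, then decide per block whether to drop it. The helper name drop_blocks_containing is hypothetical, and the sketch assumes LF line endings and that POINT and ENDPOINT start at column 0, as in the snippet above.

```python
import re

# Trimmed copy of the two blocks from the question (most attributes omitted).
SAMPLE = '''POINT=69
    Name="M_Frequency Min" Type=ANALOG
    Units="Hz"
ENDPOINT
POINT=70
    Name="M_Voltage Phase A-N Max" Type=ANALOG
    Units="Volts"
ENDPOINT
'''

# Any single block: the lazy .*? stops at the first ENDPOINT that begins a line,
# so consecutive blocks become separate matches.
BLOCK = re.compile(
    r"^POINT=\d+[ \t]*\n.*?^ENDPOINT[ \t]*$\n?",
    re.MULTILINE | re.DOTALL,
)


def drop_blocks_containing(text, needle):
    """Delete every block whose text contains `needle`; keep the others."""
    def keep_or_drop(m):
        return "" if needle in m.group(0) else m.group(0)
    return BLOCK.sub(keep_or_drop, text)


remaining = drop_blocks_containing(SAMPLE, " Max")
print(remaining)
```

Returning a different string from keep_or_drop instead of "" would replace the block rather than delete it.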

1 Answer:

Answer 0: (score: 1)

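Check the following regex. It is written in free-spacing form with comments, so it needs the x (extended) and m (multiline) flags: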

^\s*POINT=\d+\s*$  # A line consisting of the word POINT,
                   # the character '=' and one or more
                   # decimal digits, optionally surrounded
                   # by whitespace characters.
(?:\r?\n)+   # A line break: an optional '\r' before
             # the character '\n'. This sequence may be
             # repeated one or more times.
  (?:                     # Zero or more lines that are
    ^(?!                  # neither an ENDPOINT line nor
      \s*(?:POINT=\d+|    # a POINT line (the word POINT,
            ENDPOINT)\s*$ # the character '=' and one or
    ).*$                  # more decimal digits), each
    (?:\r?\n)+            # followed by its line break(s).
  )*
                  # A line that starts with one or more
                  # characters other than '=', followed
  ^[^=]+=.*Max.*$ # by the '=' character, and then the
                  # word Max anywhere in the rest of
                  # the line.
  (?:\r?\n)+
  (?:                     # Again, zero or more lines that
    ^(?!                  # are neither POINT nor ENDPOINT.
      \s*(?:POINT=\d+|ENDPOINT)\s*$
    ).*$
    (?:\r?\n)+
  )*
^\s*ENDPOINT\s*$ # A line consisting of the word ENDPOINT,
                 # optionally surrounded by whitespace.
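
As a quick check, the pattern above can be tried in Python's re module. This is a sketch under assumptions: re.VERBOSE and re.MULTILINE stand in for the PCRE x and m flags, the comments are shortened, and the sample is a trimmed copy of the two blocks from the question.

```python
import re

# Trimmed copy of the two blocks from the question (most attributes omitted).
SAMPLE = '''POINT=69
    Name="M_Frequency Min" Type=ANALOG
    Units="Hz"
ENDPOINT
POINT=70
    Name="M_Voltage Phase A-N Max" Type=ANALOG
    Units="Volts"
ENDPOINT
'''

ANSWER_PATTERN = re.compile(
    r"""
    ^\s*POINT=\d+\s*$              # opening POINT=<digits> line
    (?:\r?\n)+
    (?:                            # lines that are neither POINT nor ENDPOINT
      ^(?!\s*(?:POINT=\d+|ENDPOINT)\s*$).*$
      (?:\r?\n)+
    )*
    ^[^=]+=.*Max.*$                # an attribute line whose value contains Max
    (?:\r?\n)+
    (?:                            # more non-POINT, non-ENDPOINT lines
      ^(?!\s*(?:POINT=\d+|ENDPOINT)\s*$).*$
      (?:\r?\n)+
    )*
    ^\s*ENDPOINT\s*$               # closing ENDPOINT line
    """,
    re.VERBOSE | re.MULTILINE,
)

# Each matching block is reported as its own match.
matched_blocks = [m.group(0) for m in ANSWER_PATTERN.finditer(SAMPLE)]

# Pull the number after "POINT=" out of each matched block.
matched_ids = [block.split("=", 1)[1].splitlines()[0] for block in matched_blocks]
print(matched_ids)
```

Because the inner lines are matched one whole line at a time and may never be a POINT or ENDPOINT line, a match cannot run past the end of its own block, which is what the lazy dot-star in the question could not guarantee.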