Question

I am trying to use a regular expression to capture expressions only when the given groups differ. For example, I need to capture these (bold):

;TEXT;2;34;1;0;;;;;;3200;
PRINT_Polohopis.dgn;Different TEXT;2;64;1;0;;;;;;3200;

But not these (when the groups are the same):

;TEXT;2;34;1;0;;;;;;3200;
PRINT_Polohopis.dgn;TEXT;2;64;1;0;;;;;;3200;

So far I have managed to come up with this regex:

^;([\w\s]*;).*\n(?:[\w\s_\.]*);(?:(?!(\1))(\K[\w\s]*;))

It only works if I include the semicolon in the capture group. Is there a better way to capture these groups?

Answer 1

Something like this might work for you:

/^;([^;]+);.*?\n[^;]+;(?!\1;)([^;]+)/


The trick here is to use a negative lookahead to make sure that \1 (the backreference) does not appear at the position in question:

/^;                                 / # Start of string and literal ;
   ([^;]+);                           # Capture all but ; followed by literal ;
           .*?\n                      # Match rest of line
                [^;]+;                # Match all but ; followed by literal ;
                      (?!\1;)         # Negative lookahead to make sure captured
                                      # group is not at this position, followed
                                      # by literal ;
                             ([^;]+)  # Capture all but ;
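To try the pattern outside an online tester, here is a minimal Python sketch. It assumes Python's re module, which accepts a backreference inside a lookahead. The re.MULTILINE flag, the helper name differing_fields, and the TEXTURE sample row are assumptions added for illustration and are not part of the original answer.

```python
import re

# Pattern from the answer. re.MULTILINE is an assumption added here so that
# ^ can match at the start of every line when the input holds several pairs.
PATTERN = re.compile(r"^;([^;]+);.*?\n[^;]+;(?!\1;)([^;]+)", re.MULTILINE)

# Sample inputs taken from the question.
different = (
    ";TEXT;2;34;1;0;;;;;;3200;\n"
    "PRINT_Polohopis.dgn;Different TEXT;2;64;1;0;;;;;;3200;\n"
)
same = (
    ";TEXT;2;34;1;0;;;;;;3200;\n"
    "PRINT_Polohopis.dgn;TEXT;2;64;1;0;;;;;;3200;\n"
)


def differing_fields(text):
    """Return (first_line_field, second_line_field) for each pair that differs.

    Hypothetical helper name, used only for this illustration.
    """
    return [(m.group(1), m.group(2)) for m in PATTERN.finditer(text)]


print(differing_fields(different))  # [('TEXT', 'Different TEXT')]
print(differing_fields(same))       # []
```

The semicolon inside the lookahead matters: with (?!\1) alone, a second-line field that merely starts with the first-line field (for example TEXTURE after TEXT) would be rejected as if it were equal. Note also that [^;] can match a newline, so the sketch relies on every line containing at least one semicolon, as in the sample data.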

How can I match only expressions that contain unequal groups?