Question

I'm having trouble using a Perl regex to change the \ character according to these rules:

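The matching sequence should start with \(
It should end with \)
Any \ character inside the matched sequence should be replaced with a double backslash \\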

Sample text for reference:

Se la \probabilit&agrave; dell'evento\ A &egrave; \(\frac{3}{4} \) e la
probabilit&agrave; dell'evento B &egrave; \(\frac{1}{4}\)&nbsp;
\(\frac{3}{4} +\frac{3}{4}\)&nbsp;.
\(\frac{1}{4} - \frac{3}{4}\)&nbsp;.
\(\frac{3}{16}\)&nbsp;.
\(\frac{1}{2}\)&nbsp;.

should become:

Se la \probabilit&agrave; dell'evento\ A &egrave; \\(\\frac{3}{4} \\) e la
probabilit&agrave; dell'evento B &egrave; \\(\\frac{1}{4}\\)&nbsp;
\\(\\frac{3}{4} +\\frac{3}{4}\\)&nbsp;.
\\(\\frac{1}{4} - \\frac{3}{4}\\)&nbsp;.
\\(\\frac{3}{16}\\)&nbsp;.
\\(\\frac{1}{2}\\)&nbsp;.

This is my best attempt so far:

s/(\\\()(.*)(\\)(.*)(\\\))/\\\\\($2\\\\$4\\\\\)/mg

which produces:

Se la \probabilit&agrave; dell'evento\ A &egrave; \\(\\frac{3}{4} \\) e la
probabilit&agrave; dell'evento B &egrave; \\(\\frac{1}{4}\\)&nbsp;
\\(\frac{3}{4} +\\frac{3}{4}\\)&nbsp;.
\\(\frac{1}{4} - \\frac{3}{4}\\)&nbsp;.
\\(\\frac{3}{16}\\)&nbsp;.
\\(\\frac{1}{2}\\)&nbsp;.

As you can see,

\\(\frac{3}{4} +\\frac{3}{4}\\)&nbsp;.
\\(\frac{1}{4} - \\frac{3}{4}\\)&nbsp;.

are wrong: the first \frac in each line is not escaped.

How can I modify my regex to meet my needs?

Answer 1

I tested @sln's regex:

s/(?x)(?:(?!\A)\G[^\\]*\K\\|\\(?=\())(?=.*?(?<=\\)\))/\\\\/g;

It seems to work, although it is still a mystery to me.

Update: explanation

Formatted and tested:

 (?s)               # Inline Dot-All modifier
 (?:                # Cluster start
      (?! \A )           # Not beginning of string
      \G                 # G anchor - If matched before, start at end of last match
      [^\\]*             # Many non-escape's
      \K                 # Previous is not part of match
      \\                 # A lone escape
   |                   # or,
                         # Start of an opening '\('
      \\                 # A lone escape
      (?= \( )           #   followed by an open parenth
 )                  # Cluster end
 (?=                # Lookahead, each match validates a final '\)'
      .*? 
      (?<= \\ )
      \) 
 )
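
To see the pattern above in action, here is a minimal sketch that wraps it in a helper and runs it on the first lines of the question's sample text. The name `escape_mathjax` is hypothetical, introduced only for illustration, and `\K` needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original answer): doubles the
# backslashes using the regex quoted in this answer.
sub escape_mathjax {
    my ($text) = @_;    # $text is a copy, so the caller's string is untouched
    $text =~ s/(?s)(?:(?!\A)\G[^\\]*\K\\|\\(?=\())(?=.*?(?<=\\)\))/\\\\/g;
    return $text;
}

# Single-quoted heredoc, so every backslash stays literal.
my $sample = <<'END';
Se la \probabilit&agrave; dell'evento\ A &egrave; \(\frac{3}{4} \) e la
probabilit&agrave; dell'evento B &egrave; \(\frac{1}{4}\)&nbsp;
\(\frac{3}{4} +\frac{3}{4}\)&nbsp;.
END

print escape_mathjax($sample);
```

On this input the backslashes that appear before the first \( are left alone, and every backslash from each \( through its \) is doubled.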

Answer 2

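Posting an updated regex, revised from the original version.

The original validated the closing \) after every escape it matched. On reflection, it can be sped up by doing that validation only once, when it finds the opening of a block.

At the bottom is a benchmark comparing the two methods.

Updated regex: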

$str =~ s/(?s)(?:(?!\A)\G(?!\))[^\\]*\K\\|\\(?=\(.*?\\\)))/\\\\/g;

Formatted and tested:

 (?s)               # Dot-All modifier
 (?:                # Cluster start
      (?! \A )           # Not beginning of string
      \G                 # G anchor - If matched before, start at end of last match
      (?! \) )           # Last was an escape, so ')' ends the block
      [^\\]*             # Many non-escape's
      \K                 # Previous is not part of match
      \\                 # A lone escape
   |                   # or,
                         # New Block Check - 
      \\                 # A lone escape then,
      (?=                # One time Validation:
           \(                 #  an opening '('
           .*?                #  anything
           \\ \)              #  then a final '\)'
      )                  # -------------
 )                  # Cluster end
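
A minimal sketch of the updated pattern in use. The helper name `escape_mathjax_v2` and the input string are made up for illustration. By my reading, the `(?! \) )` check also stops the `\G` branch at each closing \), so a stray backslash between two blocks should be left untouched.

```perl
use strict;
use warnings;

# Hypothetical helper: applies the updated regex to a copy of the input.
sub escape_mathjax_v2 {
    my ($text) = @_;
    $text =~ s/(?s)(?:(?!\A)\G(?!\))[^\\]*\K\\|\\(?=\(.*?\\\)))/\\\\/g;
    return $text;
}

# Illustrative input: two blocks with a stray backslash (\x) between them.
my $input = '\(a\) and \x then \(b\)';

print escape_mathjax_v2($input), "\n";
# prints: \\(a\\) and \x then \\(b\\)
```

An opening \( with no matching \) later in the string fails the one-time lookahead, so that text is not modified.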

Benchmark:

Sample: \( \\\\\\\\\\\\\\\\\\\\\\\\\\\\\ \)

Results

New Regex:   (?s)(?:(?!\A)\G(?!\))[^\\]*\K\\|\\(?=\(.*?\\\)))
Options:  < none >
Completed iterations:   50  /  50     ( x 1000 )
Matches found per iteration:   31
Elapsed Time:    1.25 s,   1253.92 ms,   1253924 µs


Old Regex:   (?s)(?:(?!\A)\G[^\\]*\K\\|\\(?=\())(?=.*?(?<=\\)\))
Options:  < none >
Completed iterations:   50  /  50     ( x 1000 )
Matches found per iteration:   31
Elapsed Time:    3.95 s,   3952.31 ms,   3952307 µs
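
A rough way to reproduce this kind of comparison with Perl's core `Benchmark` module. This is only a sketch, not the tool that produced the figures above. The sub names are hypothetical, and the sample string (29 backslashes between \( and \)) is an assumption, chosen because it yields 31 matches with either pattern.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Assumed sample: '\( ' + 29 backslashes + ' \)'.
my $sample = '\( ' . ('\\' x 29) . ' \)';

# Each sub works on a copy and returns the number of substitutions made.
sub count_old {
    my ($s) = @_;
    my $n = ($s =~ s/(?s)(?:(?!\A)\G[^\\]*\K\\|\\(?=\())(?=.*?(?<=\\)\))/\\\\/g);
    return $n;
}

sub count_new {
    my ($s) = @_;
    my $n = ($s =~ s/(?s)(?:(?!\A)\G(?!\))[^\\]*\K\\|\\(?=\(.*?\\\)))/\\\\/g);
    return $n;
}

printf "old: %d matches, new: %d matches\n",
    count_old($sample), count_new($sample);

# Run each variant for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    old => sub { count_old($sample) },
    new => sub { count_new($sample) },
});
```

Absolute timings will differ by machine and Perl version, so only the relative rates in the table are worth comparing.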
