Perl regex for MathJax syntax

Time: 2016-02-23 21:05:36

Tags: regex perl mathjax

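I'm having trouble using a Perl regex to change the \ character according to these rules:

  1. The matching sequence should begin with \(
  2. It should end with \)
  3. Any \ character inside such a matched sequence should be replaced with a double backslash \\

Sample text for reference: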

    Se la \probabilità dell'evento\ A è \(\frac{3}{4} \) e la
    probabilità dell'evento B è \(\frac{1}{4}\) 
    \(\frac{3}{4} +\frac{3}{4}\) .
    \(\frac{1}{4} - \frac{3}{4}\) .
    \(\frac{3}{16}\) .
    \(\frac{1}{2}\) .
    

should become:

    Se la \probabilità dell'evento\ A è \\(\\frac{3}{4} \\) e la
    probabilità dell'evento B è \\(\\frac{1}{4}\\) 
    \\(\\frac{3}{4} +\\frac{3}{4}\\) .
    \\(\\frac{1}{4} - \\frac{3}{4}\\) .
    \\(\\frac{3}{16}\\) .
    \\(\\frac{1}{2}\\) .
    

So far, this is my best attempt:

    s/(\\\()(.*)(\\)(.*)(\\\))/\\\\\($2\\\\$4\\\\\)/mg
    

which produces:

    Se la \probabilità dell'evento\ A è \\(\\frac{3}{4} \\) e la
    probabilità dell'evento B è \\(\\frac{1}{4}\\) 
    \\(\frac{3}{4} +\\frac{3}{4}\\) .
    \\(\frac{1}{4} - \\frac{3}{4}\\) .
    \\(\\frac{3}{16}\\) .
    \\(\\frac{1}{2}\\) .
    

As you can see, the lines

    \\(\frac{3}{4} +\\frac{3}{4}\\) .
    \\(\frac{1}{4} - \\frac{3}{4}\\) .
    

are wrong: the first \frac in each still has a single backslash.

How can I modify my regex to meet my needs?

2 answers:

Answer 0 (score: 1)

I tested @sln's regex:

    s/(?x)(?:(?!\A)\G[^\\]*\K\\|\\(?=\())(?=.*?(?<=\\)\))/\\\\/g;

It seems to work, although it is still a mystery to me.

Update: explanation

Formatted and tested:

    (?s)               # Inline dot-all modifier
    (?:                # Cluster start
         (?! \A )           # Not at the beginning of the string
         \G                 # \G anchor: if matched before, start at the end of the last match
         [^\\]*             # Zero or more non-escape characters
         \K                 # Everything matched so far is not part of the match
         \\                 # A lone escape
      |                   # or,
                            # Start of an opening '\('
         \\                 # A lone escape
         (?= \( )           #   followed by an open parenthesis
    )                  # Cluster end
    (?=                # Lookahead: each match validates a final '\)'
         .*? 
         (?<= \\ )
         \) 
    )
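
For readers who find the single-pass pattern cryptic, here is a minimal sketch that runs it next to a more readable two-step version. The helper names `escape_single_pass` and `escape_two_step` and the sample line are made up for illustration, and the two-step version is not part of the answer. The single-pass pattern uses the `(?s)` form shown in the breakdown above.

```perl
use strict;
use warnings;

# Single-pass form, using the pattern from the breakdown above.
sub escape_single_pass {
    my ($text) = @_;
    $text =~ s/(?s)(?:(?!\A)\G[^\\]*\K\\|\\(?=\())(?=.*?(?<=\\)\))/\\\\/g;
    return $text;
}

# Hypothetical two-step equivalent (an assumption, not from the answer):
# step 1 finds each \( ... \) span, step 2 doubles every backslash in it.
# The span is copied first because $1 is read-only.
sub escape_two_step {
    my ($text) = @_;
    $text =~ s{(\\\(.*?\\\))}{
        my $span = $1;
        $span =~ s/\\/\\\\/g;
        $span;
    }gse;
    return $text;
}

my $line = 'evento\ A \(\frac{3}{4} +\frac{3}{4}\) .';
print escape_single_pass($line), "\n";
print escape_two_step($line), "\n";
# both print: evento\ A \\(\\frac{3}{4} +\\frac{3}{4}\\) .
```

The two-step form trades one regex for a callback, which makes the intent (find a block, then rewrite inside it) easier to read at the cost of a nested substitution.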

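Answer 1 (score: 1)

Posting an updated regex, revised from the original version.

The original validated the closing \) at every escape. On review, it can be sped up by validating only once, at the point where the opening of a block is found.

At the bottom is a benchmark comparing the two approaches.

Updated regex: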

    $str =~ s/(?s)(?:(?!\A)\G(?!\))[^\\]*\K\\|\\(?=\(.*?\\\)))/\\\\/g;

Formatted and tested:

    (?s)               # Dot-all modifier
    (?:                # Cluster start
         (?! \A )           # Not at the beginning of the string
         \G                 # \G anchor: if matched before, start at the end of the last match
         (?! \) )           # Last match was an escape, so a following ')' ends the block
         [^\\]*             # Zero or more non-escape characters
         \K                 # Everything matched so far is not part of the match
         \\                 # A lone escape
      |                   # or,
                            # New block check:
         \\                 # A lone escape, then
         (?=                # One-time validation:
              \(                 #  an opening '('
              .*?                #  anything
              \\ \)              #  then a final '\)'
         )                  # -------------
    )                  # Cluster end
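
The `(?! \) )` check also changes behavior in one edge case, which a small sketch can show. The helper names and the input string are made up for illustration: the input has a lone `\foo` sitting between two blocks. The old pattern is copied from the "Old Regex" line of the benchmark below.

```perl
use strict;
use warnings;

# Updated pattern: the \G chain stops once the closing \) has been reached.
sub escape_updated {
    my ($text) = @_;
    $text =~ s/(?s)(?:(?!\A)\G(?!\))[^\\]*\K\\|\\(?=\(.*?\\\)))/\\\\/g;
    return $text;
}

# Old pattern: the \G chain keeps going as long as some \) follows later.
sub escape_old {
    my ($text) = @_;
    $text =~ s/(?s)(?:(?!\A)\G[^\\]*\K\\|\\(?=\())(?=.*?(?<=\\)\))/\\\\/g;
    return $text;
}

# Hypothetical input: "\foo" is outside any \( ... \) block.
my $text = '\(a\) and \foo then \(\frac{1}{2}\)';

print escape_updated($text), "\n";
# prints: \\(a\\) and \foo then \\(\\frac{1}{2}\\)

print escape_old($text), "\n";
# prints: \\(a\\) and \\foo then \\(\\frac{1}{2}\\)
```

On the question's sample text both patterns give the same result, because no lone backslash appears after a closed block there.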

Benchmark:

Sample:

    \( \\\\\\\\\\\\\\\\\\\\\\\\\\\\\ \)

Results:

    New Regex:   (?s)(?:(?!\A)\G(?!\))[^\\]*\K\\|\\(?=\(.*?\\\)))
    Options:  < none >
    Completed iterations:   50  /  50     ( x 1000 )
    Matches found per iteration:   31
    Elapsed Time:    1.25 s,   1253.92 ms,   1253924 µs


    Old Regex:   (?s)(?:(?!\A)\G[^\\]*\K\\|\\(?=\())(?=.*?(?<=\\)\))
    Options:  < none >
    Completed iterations:   50  /  50     ( x 1000 )
    Matches found per iteration:   31
    Elapsed Time:    3.95 s,   3952.31 ms,   3952307 µs
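
The figures above come from the answerer's own benchmarking tool. A rough way to repeat the comparison in plain Perl is sketched below with the core `Benchmark` module. The sample is an assumption: it uses 29 inner backslashes, inferred from the 31 matches per iteration reported above (29 plus the two delimiter backslashes). `count_substitutions` is a hypothetical helper, and timings will differ by machine and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Assumed sample: \( followed by 29 backslashes and a closing \)
my $sample = '\( ' . ('\\' x 29) . ' \)';

my $new_re = qr/(?s)(?:(?!\A)\G(?!\))[^\\]*\K\\|\\(?=\(.*?\\\)))/;
my $old_re = qr/(?s)(?:(?!\A)\G[^\\]*\K\\|\\(?=\())(?=.*?(?<=\\)\))/;

# Hypothetical helper: runs the substitution on a copy of the text
# and returns how many backslashes were doubled.
sub count_substitutions {
    my ($re, $text) = @_;
    my $n = ($text =~ s/$re/\\\\/g);
    return $n || 0;
}

printf "new: %d matches, old: %d matches\n",
    count_substitutions($new_re, $sample),
    count_substitutions($old_re, $sample);

# Run each variant for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    new => sub { count_substitutions($new_re, $sample) },
    old => sub { count_substitutions($old_re, $sample) },
});
```

The old pattern rescans the rest of the string at every backslash to find the closing `\)`, while the updated one does that scan once per block, which is consistent with the gap in the reported timings.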