Question

I am working on an exercise about regular expressions, and I am really not sure how to approach it.

The regular expression is:

((a*)(b*))* ∪ (a*)

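I am not very good at this, but I think ((a*)(b*))* can be simplified to (a ∪ b)*. If that is right, then the trailing ∪ (a*) is just redundant, so I think the whole expression can be simplified to (a ∪ b)*. Does this look correct?

Edit: ∪ stands for union.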

Answer 1

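You are correct. (a*b*)* can match any string of a's and b's, and so can (a U b)*, so the two are equivalent. (a U b)* intersected with a* is a*, so a* is a subset of (a U b)*. The whole expression can therefore be simplified to (a U b)*.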
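One way to sanity-check this equivalence is to brute-force it over short strings. The sketch below is only an illustration, not a proof. It assumes Python's re module, writes the union as |, and uses a hypothetical helper named agree_up_to to compare the original expression with (a|b)* on every string over {a, b, c} up to a small length. The extra letter c is there so that strings outside the language are exercised too.

```python
import re
from itertools import product

# The original expression, with the union written as alternation (|).
ORIGINAL = re.compile(r"((a*)(b*))*|(a*)")
# The proposed simplification.
SIMPLIFIED = re.compile(r"(a|b)*")

def agree_up_to(max_len, alphabet="abc"):
    """Return True if both patterns accept exactly the same strings
    over `alphabet` of length 0..max_len (whole-string matching)."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            s = "".join(letters)
            in_original = ORIGINAL.fullmatch(s) is not None
            in_simplified = SIMPLIFIED.fullmatch(s) is not None
            if in_original != in_simplified:
                return False
    return True

print(agree_up_to(5))  # prints True
```

Agreement on short strings is evidence rather than a proof, but a single disagreement would be enough to refute the simplification.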

Answer 2

What ((a*)(b*))*U(a*) really means is the following (copied from here):

NODE                     EXPLANATION
--------------------------------------------------------------------------------
  (                        group and capture to \1 (0 or more times
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    (                        group and capture to \2:
--------------------------------------------------------------------------------
      a*                       'a' (0 or more times (matching the
                               most amount possible))
--------------------------------------------------------------------------------
    )                        end of \2
--------------------------------------------------------------------------------
    (                        group and capture to \3:
--------------------------------------------------------------------------------
      b*                       'b' (0 or more times (matching the
                               most amount possible))
--------------------------------------------------------------------------------
    )                        end of \3
--------------------------------------------------------------------------------
  )*                       end of \1 (NOTE: because you are using a
                           quantifier on this capture, only the LAST
                           repetition of the captured pattern will be
                           stored in \1)
--------------------------------------------------------------------------------
  U                        'U'
--------------------------------------------------------------------------------
  (                        group and capture to \4:
--------------------------------------------------------------------------------
    a*                       'a' (0 or more times (matching the most
                             amount possible))
--------------------------------------------------------------------------------
  )                        end of \4

As written, this expression matches all of these sequences: abUa, bU, U, aabbUaa, aaUaa, aaU, Uaa, bbU, ababUaa, aabbaabbUaa (see here).
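To see this literal reading in action, here is a small sketch that assumes Python's re module rather than the tool the answer originally linked to. It checks each sample string against the pattern with U treated as an ordinary character. The names PATTERN and SAMPLES are just illustrative.

```python
import re

# U is an ordinary character here, not a union operator.
PATTERN = re.compile(r"((a*)(b*))*U(a*)")

SAMPLES = ["abUa", "bU", "U", "aabbUaa", "aaUaa",
           "aaU", "Uaa", "bbU", "ababUaa", "aabbaabbUaa"]

# Every sample is consumed in full by the pattern.
all_match = all(PATTERN.fullmatch(s) is not None for s in SAMPLES)
print(all_match)  # prints True

# A string without a literal U cannot match in full.
print(PATTERN.fullmatch("abab") is None)  # prints True
```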

You cannot simplify this while keeping the order of the letters unless you remove the capturing groups.

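Edit: If the U in your regular expression stands for "union", then this expression is not valid regex syntax. Regex engines have no U operator; the closest equivalent is alternation (OR), written with | (pipe). If you want the union of ((a*)(b*))* and (a*), the result is probably just ((a*)(b*))*, but that will still match things like abaab.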

Nonetheless, the capturing groups in your regular expression serve no purpose, so something like [ab]* is enough to match any number of a's and b's.
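As a rough illustration of the point about capturing groups, the sketch below (assuming Python's re module, with | standing in for the union) compares the number of capture groups in the two patterns and confirms that the character class still accepts a mixed string such as abaab. The variable names are illustrative only.

```python
import re

with_groups = re.compile(r"((a*)(b*))*|(a*)")
char_class = re.compile(r"[ab]*")

# Pattern.groups reports how many capturing groups a pattern defines.
print(with_groups.groups)  # 4, i.e. \1 through \4 in the breakdown above
print(char_class.groups)   # 0

# Both accept a string that mixes a's and b's in any order.
print(with_groups.fullmatch("abaab") is not None)  # True
print(char_class.fullmatch("abaab") is not None)   # True
```

If grouping is needed only for structure, a non-capturing group such as (?:a|b)* is another option in engines that support that syntax.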

Simplifying a regular expression
