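Given two regular expressions A = 0*1* U 1*0* and B = (01 U 10)*, how can I determine whether one is a subset of the other? One approach I thought of is to list some example strings and see whether the two have any in common. In this case, I see that the strings 01 and 10 are in both sets. So does that mean neither is a subset of the other? How can I tell whether one regular expression is a subset of another? In general, how do you approach this kind of problem?
Answer 0 (score: 2)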
There are obviously a lot of ways to do this - any logical argument could constitute a valid proof. However, an instructive method of answering this question is to use algorithms to compute an answer to the general question.
Two languages are equal if each contains the other. If one language contains another, the difference of the contained language and the containing language is the empty set. Therefore, if two languages A and B are equal, then A \ B and B \ A are both empty; and if A \ B and B \ A are both empty, then A and B must be equal.
Given a regular expression, there is at least one known correct algorithm to convert it into an equivalent NFA with lambda/epsilon transitions. Such a construction is used in the canonical proof of the equivalence of regular expressions and finite automata.
Given an NFA with lambda/epsilon transitions, there is at least one known correct algorithm to convert it into an equivalent DFA. The subset construction is such an algorithm.
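As a rough illustration of the subset construction, here is a minimal Python sketch. The representation is an assumption made for this example, not something from the answer: `delta` maps `(state, symbol)` to a set of states, `eps` maps a state to the states reachable by one epsilon move, and the resulting DFA is a `(start, delta, accepting)` tuple whose states are frozensets of NFA states. The example NFA, a two-state machine for 0*1*, is also made up for illustration.

```python
from collections import deque

def epsilon_closure(states, eps):
    """All states reachable from `states` using only epsilon moves."""
    stack, closure = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in eps.get(q, ()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return frozenset(closure)

def nfa_to_dfa(alphabet, delta, eps, start, accepting):
    """Subset construction: each DFA state is a frozenset of NFA states."""
    d_start = epsilon_closure({start}, eps)
    d_delta, d_accepting = {}, set()
    queue, seen = deque([d_start]), {d_start}
    while queue:
        S = queue.popleft()
        if S & accepting:  # S contains at least one accepting NFA state
            d_accepting.add(S)
        for a in alphabet:
            moved = set()
            for q in S:
                moved |= delta.get((q, a), set())
            T = epsilon_closure(moved, eps)  # the empty set acts as the dead state
            d_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                queue.append(T)
    return d_start, d_delta, d_accepting

def dfa_accepts(dfa, word):
    """Run `word` through a DFA given as (start, delta, accepting)."""
    state, delta, accepting = dfa
    for ch in word:
        state = delta[(state, ch)]
    return state in accepting

# Illustrative epsilon-NFA for 0*1*: p loops on 0, p --eps--> q, q loops on 1.
dfa = nfa_to_dfa("01", {("p", "0"): {"p"}, ("q", "1"): {"q"}},
                 {"p": {"q"}}, "p", {"q"})
print(dfa_accepts(dfa, "0011"))  # True
print(dfa_accepts(dfa, "10"))    # False
```

Only subsets reachable from the start state are generated, so the result is usually far smaller than the worst-case 2^n states.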
Given two DFAs there is at least one known correct algorithm to produce a DFA which accepts the difference of the languages accepted by those two DFAs. The Cartesian Product Machine construction is such an algorithm.
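To make the product construction concrete, the sketch below applies it to the two expressions from the question. The DFAs for A and B are built by hand for this example (the state names and the `make_dfa` helper are hypothetical), and a DFA is assumed to be a `(start, delta, accepting)` tuple with a total transition dict.

```python
from itertools import product

def make_dfa(start, rows, accepting):
    """rows maps state -> (next state on '0', next state on '1')."""
    delta = {}
    for q, (on0, on1) in rows.items():
        delta[(q, "0")] = on0
        delta[(q, "1")] = on1
    return start, delta, set(accepting)

def difference_dfa(alphabet, dfa1, dfa2):
    """Cartesian product DFA accepting L(dfa1) minus L(dfa2)."""
    start1, delta1, acc1 = dfa1
    start2, delta2, acc2 = dfa2
    states1 = {q for (q, _) in delta1}
    states2 = {q for (q, _) in delta2}
    delta, accepting = {}, set()
    for p, q in product(states1, states2):
        for a in alphabet:
            # Both machines step in lockstep on the same symbol.
            delta[((p, q), a)] = (delta1[(p, a)], delta2[(q, a)])
        # Accept exactly when dfa1 accepts and dfa2 rejects.
        if p in acc1 and q not in acc2:
            accepting.add((p, q))
    return (start1, start2), delta, accepting

def accepts(dfa, word):
    state, delta, accepting = dfa
    for ch in word:
        state = delta[(state, ch)]
    return state in accepting

# Hand-built DFA for A = 0*1* U 1*0* ("d" is the dead state).
dfa_a = make_dfa("s", {
    "s": ("z", "o"), "z": ("z", "zo"), "zo": ("d", "zo"),
    "o": ("oz", "o"), "oz": ("oz", "d"), "d": ("d", "d"),
}, {"s", "z", "zo", "o", "oz"})

# Hand-built DFA for B = (01 U 10)*.
dfa_b = make_dfa("e", {
    "e": ("h0", "h1"), "h0": ("d", "e"), "h1": ("e", "d"), "d": ("d", "d"),
}, {"e"})

a_minus_b = difference_dfa("01", dfa_a, dfa_b)
b_minus_a = difference_dfa("01", dfa_b, dfa_a)
print(accepts(a_minus_b, "0"))     # True: "0" is in A but not in B
print(accepts(b_minus_a, "0110"))  # True: "0110" is in B but not in A
```

Since both difference languages contain at least one string, neither A nor B is a subset of the other, even though they share strings such as 01 and 10.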
Given a DFA, there is an algorithm to determine whether it accepts the empty language. DFA minimization followed by a check for any accepting states is one such algorithm.
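A lighter-weight alternative to minimization, swapped in here for brevity, is a plain reachability search: the language is empty exactly when no accepting state can be reached from the start state. The sketch below assumes a DFA is a `(start, delta, accepting)` tuple with a total transition dict, and the function name and the two tiny example DFAs are made up for illustration. It returns a shortest accepted string as a witness, or `None` if the language is empty.

```python
from collections import deque

def shortest_accepted(alphabet, dfa):
    """Breadth-first search from the start state for an accepting state.

    Returns a shortest accepted string, or None if the language is empty.
    """
    start, delta, accepting = dfa
    queue, seen = deque([(start, "")]), {start}
    while queue:
        state, word = queue.popleft()
        if state in accepting:
            return word
        for a in alphabet:
            nxt = delta[(state, a)]
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, word + a))
    return None

# Strings ending in 1: state "q" means the last symbol read was 1.
ends_in_one = ("p", {("p", "0"): "p", ("p", "1"): "q",
                     ("q", "0"): "p", ("q", "1"): "q"}, {"q"})

# The accepting state "q" exists but is unreachable, so the language is empty.
empty_lang = ("p", {("p", "0"): "p", ("p", "1"): "p",
                    ("q", "0"): "q", ("q", "1"): "q"}, {"q"})

print(shortest_accepted("01", ends_in_one))  # 1
print(shortest_accepted("01", empty_lang))   # None
```

Returning a witness rather than a bare yes/no is a design choice: when a subset test fails, the witness is a concrete counterexample string.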
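Therefore, to algorithmically determine whether two regular expressions r1 and r2 are equal:
1. Convert r1 and r2 into equivalent NFAs with lambda/epsilon transitions.
2. Convert those NFAs into equivalent DFAs M1 and M2 using the subset construction.
3. Use the Cartesian Product Machine construction to build DFAs for L(M1) \ L(M2) and L(M2) \ L(M1).
4. Check whether each difference DFA accepts the empty language. r1 and r2 are equal if and only if both do.
The same procedure answers the subset question: L(r1) is a subset of L(r2) if and only if L(M1) \ L(M2) is empty.
When in doubt, work it out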