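As far as we know, given a regular expression pattern (for example A B A B A C), we can convert it into a DFA. In this example, the DFA looks like a chain (you can test it here).
This "chain-like" DFA can decide whether a given string matches the pattern (that is, accept or reject it), but it cannot tell whether the pattern occurs anywhere inside a string, nor identify all of the occurrences.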
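Example:
Suppose this is the string to search: A B C A B A B A B A C A B C
Although the pattern occurs starting at the 6th character, the "chain-like" DFA cannot report this. All it can do is reject the string.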
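Question: is it possible to design a regular expression that supports this kind of functionality?
(Note: I realize this question may be a bit confusing; I am happy to clarify anything that is unclear.)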
Answer 0 (score: 0)
The language of strings containing the substring ABABAC is matched by the regular expression:
.*ABABAC.*
Where the symbol . denotes a subexpression that matches any single input symbol (e.g. (A|B|C), if the input language only has the symbols A, B and C). To tell if a string has the substring ABABAC, you can build an NFA or a DFA from this regular expression, and check if it accepts your string.
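As a quick illustration of the difference between the two expressions, here is a minimal sketch using Python's re module. The function names chain_accepts and contains_accepts are made up for this example, and the input is written without the separating spaces used in the question.

```python
import re

# The "chain" pattern: accepts only the exact string ABABAC.
CHAIN = re.compile(r"ABABAC")
# The padded pattern: accepts any string that contains ABABAC.
CONTAINS = re.compile(r".*ABABAC.*")

def chain_accepts(s):
    """Whole-string match against the chain pattern (hypothetical helper)."""
    return CHAIN.fullmatch(s) is not None

def contains_accepts(s):
    """Whole-string match against .*ABABAC.* (hypothetical helper)."""
    return CONTAINS.fullmatch(s) is not None

text = "ABCABABABACABC"  # the search string from the question
print(chain_accepts(text))     # the chain pattern rejects it
print(contains_accepts(text))  # the padded pattern accepts it
```

Both calls still return only accept or reject; neither says where the occurrence is.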
Determining the location of the substring in the input string is not possible with a (single) standard N/DFA, simply because an N/DFA is defined to only return one bit of information (accept/reject). However, it is possible to implement an "augmented N/DFA" that, in addition to matching the input, also keeps track of where in the string each state transition last occurred; this information is enough to efficiently reconstruct the location of the substring.
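One simple way to get positions out of an automaton, sketched below, is to run a DFA for "text ending in ABABAC" over the input and record the index every time the accepting state is entered. Because the pattern has a fixed length, the start of each occurrence follows from the end index. This is not necessarily the construction the paragraph above has in mind; it uses the well-known Knuth-Morris-Pratt DFA, and the names build_dfa and find_all and the default alphabet "ABC" are assumptions made for the example.

```python
def build_dfa(pattern, alphabet):
    """Build a KMP-style DFA: dfa[state][symbol] -> next state.

    State j means "the last j symbols read equal pattern[:j]".
    State len(pattern) is the accepting state.
    """
    m = len(pattern)
    dfa = [{c: 0 for c in alphabet} for _ in range(m + 1)]
    dfa[0][pattern[0]] = 1
    x = 0  # restart state: where the DFA would be after reading pattern[1:j]
    for j in range(1, m + 1):
        for c in alphabet:
            dfa[j][c] = dfa[x][c]        # on a mismatch, behave like the restart state
        if j < m:
            dfa[j][pattern[j]] = j + 1   # on a match, advance along the chain
            x = dfa[x][pattern[j]]
    return dfa

def find_all(pattern, text, alphabet="ABC"):
    """Return the 1-based start positions of every occurrence of pattern.

    Assumes every symbol of text is in alphabet (otherwise a KeyError is raised).
    """
    dfa = build_dfa(pattern, alphabet)
    m = len(pattern)
    state = 0
    starts = []
    for i, c in enumerate(text):
        state = dfa[state][c]
        if state == m:                   # accepting state entered at 0-based index i
            starts.append(i - m + 2)     # convert end index to 1-based start
    return starts

print(find_all("ABABAC", "ABCABABABACABC"))  # [6]
```

On the search string from the question this reports position 6, matching the occurrence the asker pointed out. Overlapping occurrences are also found, since the accepting state has its own outgoing transitions.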