Pattern matching with dynamic programming: asking for advice

Time: 2015-12-08 20:05:44

Tags: python algorithm

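I am working on the pattern matching problem below, and I am posting the detailed problem statement and my code. The code works. In the implementation below, the outer loop iterates over the pattern and the inner loop over the source string, in order to build a two-dimensional DP table.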

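My question is: if I change the implementation so that the outer loop iterates over the source string and the inner loop over the pattern, will there be any performance gain or any functional defect? Any advice on which flavor is better, or whether they are roughly the same, is appreciated.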

More specifically, I mean changing the loops from the following (keeping similar logic in the loop body),

    for i in range(1, len(p) + 1):
        for j in range(1, len(s) + 1):

to,

    for i in range(1, len(s) + 1):
        for j in range(1, len(p) + 1):

Problem statement

  

'.' matches any single character.
'*' matches zero or more of the preceding element.

The matching should cover the entire input string (not partial).

The function prototype should be:
bool isMatch(const char *s, const char *p)

Some examples:
isMatch("aa","a") → false
isMatch("aa","aa") → true
isMatch("aaa","aa") → false
isMatch("aa","a*") → true
isMatch("aa",".*") → true
isMatch("ab",".*") → true
isMatch("aab","c*a*b") → true
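
As a quick sanity check of the examples above, they can be compared against Python's standard `re` module. This is only a sketch: it assumes that for patterns made of letters, '.' and '*', Python's regex syntax means the same thing as the problem statement, and the names `EXAMPLES` and `oracle_is_match` are made up for illustration. It is not the DP solution itself.

```python
import re

# The examples from the problem statement: (s, p, expected result).
EXAMPLES = [
    ("aa", "a", False),
    ("aa", "aa", True),
    ("aaa", "aa", False),
    ("aa", "a*", True),
    ("aa", ".*", True),
    ("ab", ".*", True),
    ("aab", "c*a*b", True),
]


def oracle_is_match(s, p):
    # Assumption: p contains only letters, '.' and '*', so Python's regex
    # semantics coincide with the problem statement. fullmatch requires the
    # whole string to match, not just a prefix.
    return re.fullmatch(p, s) is not None


for s, p, expected in EXAMPLES:
    assert oracle_is_match(s, p) == expected
```

Such an oracle is handy for testing any DP implementation on valid patterns, but it raises `re.error` for patterns like a leading '*', which the problem statement does not define either.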

class Solution(object):

    def isMatch(self, s, p):
        # The DP table and the string s and p use the same indexes i and j, but
        # table[i][j] means the match status between p[:i] and s[:j], i.e.
        # table[0][0] means the match status of two empty strings, and
        # table[1][1] means the match status of p[0] and s[0]. Therefore, when
        # referring to the i-th and the j-th characters of p and s for updating
        # table[i][j], we use p[i - 1] and s[j - 1].

        # Initialize the table with False. The first row is satisfied.
        table = [[False] * (len(s) + 1) for _ in range(len(p) + 1)]

        # Update the corner case of matching two empty strings.
        table[0][0] = True

        # Update the corner case of when s is an empty string but p is not.
        # Since each '*' can eliminate the character before it, the table is
        # vertically updated by the one before previous. [test_symbol_0]
        for i in range(2, len(p) + 1):
            table[i][0] = table[i - 2][0] and p[i - 1] == '*'

        for i in range(1, len(p) + 1):
            for j in range(1, len(s) + 1):
                if p[i - 1] != "*":
                    # Update the table by referring the diagonal element.
                    table[i][j] = table[i - 1][j - 1] and \
                                  (p[i - 1] == s[j - 1] or p[i - 1] == '.')
                else:
                    # Eliminations (referring to the vertical element)
                    # Either refer to the one before previous or the previous.
                    # I.e. * eliminate the previous or count the previous.
                    # [test_symbol_1]
                    table[i][j] = table[i - 2][j] or table[i - 1][j]

                    # Propagations (referring to the horizontal element)
                    # If p's previous one is equal to the current s, with the
                    # help of *, the status can be propagated from the left.
                    # [test_symbol_2]
                    if p[i - 2] == s[j - 1] or p[i - 2] == '.':
                        table[i][j] |= table[i][j - 1]

        return table[-1][-1]

Thanks in advance, Lin

1 answer:

Answer 0 (score: 2)

If you swap the loops, i becomes the index into s and j becomes the index into p. You need to swap i and j everywhere in the loop body:

    for i in range(1, len(s) + 1):
        for j in range(1, len(p) + 1):
            if p[j - 1] != "*":
                # Update the table by referring the diagonal element.
                table[j][i] = table[j - 1][i - 1] and \
                              (p[j - 1] == s[i - 1] or p[j - 1] == '.')
            else:
                # Eliminations (referring to the vertical element)
                # Either refer to the one before previous or the previous.
                # I.e. * eliminate the previous or count the previous.
                # [test_symbol_1]
                table[j][i] = table[j - 2][i] or table[j - 1][i]

                # Propagations (referring to the horizontal element)
                # If p's previous one is equal to the current s, with the
                # help of *, the status can be propagated from the left.
                # [test_symbol_2]
                if p[j - 2] == s[i - 1] or p[j - 2] == '.':
                    table[j][i] |= table[j][i - 1]

The original algorithm fills table row by row (first row 1, then 2, 3, ...). After the swap, the table is filled column by column (first column 1, then 2, 3, ...).

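The idea of the algorithm stays the same, because every element of table is defined through elements in previous columns and/or rows. Those are elements you have already computed, whether you go row by row or column by column.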

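In detail, table[j][i] is defined either through the diagonal element table[j-1][i-1] from the previous column, or through elements from previous rows and/or columns: table[j-2][i], table[j-1][i], and/or table[j][i-1].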
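
The dependency argument can be checked mechanically. Below is a small sketch (the helper names `cells_needed` and `order_is_valid` are invented for this illustration) that walks a fill order and verifies that every cell a given cell reads has already been filled. Here r indexes the pattern and c indexes the string, and the pattern is assumed not to start with '*'.

```python
def cells_needed(r, c, p):
    """Cells that table[r][c] may read; r indexes p, c indexes s."""
    if p[r - 1] != "*":
        return [(r - 1, c - 1)]          # diagonal element
    # Assumption: p does not start with '*', so r >= 2 here.
    return [(r - 2, c), (r - 1, c), (r, c - 1)]


def order_is_valid(order, p, len_s):
    # Row 0 and column 0 are initialised before the main loops run.
    done = {(0, c) for c in range(len_s + 1)}
    done |= {(r, 0) for r in range(len(p) + 1)}
    for r, c in order:
        if any(cell not in done for cell in cells_needed(r, c, p)):
            return False
        done.add((r, c))
    return True


p, len_s = "c*a*b", 3
row_major = [(r, c) for r in range(1, len(p) + 1) for c in range(1, len_s + 1)]
col_major = [(r, c) for c in range(1, len_s + 1) for r in range(1, len(p) + 1)]

print(order_is_valid(row_major, p, len_s))                  # True
print(order_is_valid(col_major, p, len_s))                  # True
print(order_is_valid(list(reversed(row_major)), p, len_s))  # False
```

Both loop orders pass, while a reversed order fails on its very first cell, which shows that the check is not vacuous.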

So performance is the same after the swap. In both versions, each element of table takes constant time to compute, so the total time to build table is O(len(s) * len(p)).

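Functionality is also the same after the swap. Basically, if the original version is correct, then the modified version is correct too. Whether the original is correct is another story...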
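
One way to gain confidence in this claim is to run both loop orders on many small inputs and compare the results. The following is a rough sketch, not the asker's exact code: the function name `is_match` and its `outer` parameter are made up here so that one function can fill the same table in either order.

```python
import itertools


def is_match(s, p, outer="p"):
    """DP from the question; `outer` picks which string the outer loop walks."""
    table = [[False] * (len(s) + 1) for _ in range(len(p) + 1)]
    table[0][0] = True
    for r in range(2, len(p) + 1):
        table[r][0] = table[r - 2][0] and p[r - 1] == "*"

    rows = range(1, len(p) + 1)   # r indexes the pattern
    cols = range(1, len(s) + 1)   # c indexes the string
    if outer == "p":
        order = ((r, c) for r in rows for c in cols)   # row by row
    else:
        order = ((r, c) for c in cols for r in rows)   # column by column

    for r, c in order:
        if p[r - 1] != "*":
            table[r][c] = table[r - 1][c - 1] and p[r - 1] in (s[c - 1], ".")
        else:
            # Same indexing as the original, including the wrap-around
            # to index -1 when r == 1.
            table[r][c] = table[r - 2][c] or table[r - 1][c]
            if p[r - 2] in (s[c - 1], "."):
                table[r][c] |= table[r][c - 1]
    return table[-1][-1]


# Exhaustive comparison over all strings and patterns up to length 3.
for n, m in itertools.product(range(4), repeat=2):
    for s in map("".join, itertools.product("ab", repeat=n)):
        for p in map("".join, itertools.product("ab.*", repeat=m)):
            assert is_match(s, p, "p") == is_match(s, p, "s")
```

The comparison only shows that the two orders agree with each other on these inputs; it says nothing about whether either one is correct for malformed patterns.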

Let's look at the original version. At first glance, it seems to have an indexing problem in two places when i = 1: table[i - 2][j] and p[i - 2].

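However, Python interprets index -1 as the last element. So table[-1][j] refers to the last row of table, in which all elements are still False at that point. Hence table[1][j] = table[-1][j] or table[0][j] is equivalent to table[1][j] = table[0][j].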
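
The wrap-around is easy to see in isolation. Here is a minimal sketch with an arbitrarily chosen pattern that starts with '*', using a freshly initialised table as in the question:

```python
p = "*a"   # hypothetical pattern with a leading '*'
s = "aa"
table = [[False] * (len(s) + 1) for _ in range(len(p) + 1)]
table[0][0] = True

i = 1
# table[i - 2] is table[-1]: Python wraps around to the last row
# instead of raising an IndexError.
print(table[i - 2] is table[len(p)])   # True

# For j >= 1 that row has not been filled yet, so it is all False...
print(any(table[i - 2][1:]))           # False

# ...and `table[-1][j] or table[0][j]` collapses to table[0][j].
for j in range(1, len(s) + 1):
    assert (table[i - 2][j] or table[0][j]) == table[0][j]
```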

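As for p[-1], note that you only access it in the if statement when p[0] = '*' (which makes no sense for matching). The value of p[-1] does not matter, because it does not affect the value of table[i][j]. To see this: if the condition of the if statement happens to be True, we know that table[1][0] is initially False, so table[1][1], table[1][2], ... must also be False. In other words, a pattern with p[0] = '*' will not match any string.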
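
If relying on negative indices feels too subtle, the same behaviour can be spelled out with an explicit guard. This is one possible variant, not part of the original code; the name `is_match_guarded` is made up for illustration.

```python
def is_match_guarded(s, p):
    """Same DP as the question, but without wrapping around to index -1."""
    table = [[False] * (len(s) + 1) for _ in range(len(p) + 1)]
    table[0][0] = True
    for i in range(2, len(p) + 1):
        table[i][0] = table[i - 2][0] and p[i - 1] == "*"

    for i in range(1, len(p) + 1):
        for j in range(1, len(s) + 1):
            if p[i - 1] != "*":
                table[i][j] = table[i - 1][j - 1] and p[i - 1] in (s[j - 1], ".")
            elif i >= 2:
                table[i][j] = table[i - 2][j] or table[i - 1][j]
                if p[i - 2] in (s[j - 1], "."):
                    table[i][j] |= table[i][j - 1]
            # else: a leading '*' has no preceding element, so the cell
            # simply stays False, which is what the original computes too.
    return table[-1][-1]


print(is_match_guarded("aa", "*a"))      # False
print(is_match_guarded("aab", "c*a*b"))  # True
```

The guard makes the intent visible to the reader at the cost of one extra comparison per '*' cell.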