Unable to understand the CYK algorithm pseudocode

Time: 2013-04-15 01:04:35

Tags: java algorithm parsing cyk

I am reading about the CYK algorithm, and there is one part of the pseudocode that I cannot understand. The whole pseudocode is:

let the input be a string S consisting of n characters: a1 ... an.
let the grammar contain r nonterminal symbols R1 ... Rr.
This grammar contains the subset Rs which is the set of start symbols.
let P[n,n,r] be an array of booleans. Initialize all elements of P to false.
for each i = 1 to n
  for each unit production Rj -> ai
    set P[i,1,j] = true
for each i = 2 to n -- Length of span
  for each j = 1 to n-i+1 -- Start of span
    for each k = 1 to i-1 -- Partition of span
      for each production RA -> RB RC
        if P[j,k,B] and P[j+k,i-k,C] then set P[j,i,A] = true
if any of P[1,n,x] is true (x is iterated over the set s, where s are all the indices for Rs) then
  S is member of language
else
  S is not member of language

This is the part that confuses me:

    for each production RA -> RB RC
      if P[j,k,B] and P[j+k,i-k,C] then set P[j,i,A] = true

Could anyone give me some hints about this part of the pseudocode?

1 answer:

Answer 0: (score: 3)

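The pseudocode

    for each production RA -> RB RC
      if P[j,k,B] and P[j+k,i-k,C] then set P[j,i,A] = true

should be interpreted in the following way. Suppose that P[j,k,B] is true. That means the string formed from the k characters starting at position j can be derived from the nonterminal RB. If it is also the case that P[j+k,i-k,C] is true, then the string formed from the i-k characters starting at position j+k can be derived from the nonterminal RC. Those two pieces sit side by side and together cover exactly the i characters starting at position j. Therefore, since RA -> RB RC is a production, the string formed from the i characters starting at position j can be derived from RA, which is why P[j,i,A] is set to true.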
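To make the index arithmetic concrete, here is a small Java sketch that lists every way the loop over k cuts one span into a left piece (checked against P[j,k,B]) and a right piece (checked against P[j+k,i-k,C]). The class name `SpanSplits` and the method `splits` are made up for this illustration and are not part of the original pseudocode. Positions are 1-based, as in the pseudocode.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, only meant to illustrate the index arithmetic.
final class SpanSplits {
    /**
     * Lists every split of the span made of the i characters
     * starting at 1-based position j, written as "left|right".
     */
    static List<String> splits(String s, int j, int i) {
        List<String> out = new ArrayList<>();
        for (int k = 1; k <= i - 1; k++) {
            // Left piece: k characters starting at position j (the P[j,k,B] part).
            String left = s.substring(j - 1, j - 1 + k);
            // Right piece: i-k characters starting at position j+k (the P[j+k,i-k,C] part).
            String right = s.substring(j - 1 + k, j - 1 + i);
            out.add(left + "|" + right);
        }
        return out;
    }

    public static void main(String[] args) {
        // Span of length i = 3 starting at position j = 2 in "baaba" is "aab".
        System.out.println(splits("baaba", 2, 3)); // prints [a|ab, aa|b]
    }
}
```

Each printed split corresponds to one value of k. The production RA -> RB RC applies to the span if, for at least one of these splits, the left piece derives from RB and the right piece derives from RC.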

I think it might help to read the pseudocode as

    for each production RA -> RB RC
      if P[j,k,B] == true and P[j+k,i-k,C] == true, then set P[j,i,A] = true
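Since the question is tagged java, here is a minimal sketch of the whole pseudocode in Java, under a few assumptions of my own: nonterminals are numbered 0 to r-1, a unit production Rj -> a is stored as `{j, 'a'}`, and a binary production RA -> RB RC is stored as `{A, B, C}`. The class name `CykSketch`, the method `recognizes`, and the example grammar (S -> AB | BC, A -> BA | a, B -> CC | b, C -> AB | a) are illustrative choices, not something from the pseudocode. The table keeps the 1-based indices so that each line can be matched against the pseudocode.

```java
// Hypothetical sketch; the rule encoding and all names are assumptions.
final class CykSketch {
    // Example grammar in Chomsky normal form: S=0, A=1, B=2, C=3.
    static final int[][] UNIT = { {1, 'a'}, {2, 'b'}, {3, 'a'} };
    static final int[][] BINARY = {
        {0, 1, 2}, {0, 2, 3},   // S -> A B | B C
        {1, 2, 1},              // A -> B A
        {2, 3, 3},              // B -> C C
        {3, 1, 2}               // C -> A B
    };

    static boolean recognizes(String s, int r, int[][] unitRules,
                              int[][] binaryRules, int[] startSymbols) {
        int n = s.length();
        if (n == 0) return false; // this sketch does not handle the empty string
        // p[j][i][A]: the i characters starting at position j derive from RA.
        // Index 0 is left unused so the indices match the 1-based pseudocode.
        boolean[][][] p = new boolean[n + 1][n + 1][r];
        for (int i = 1; i <= n; i++) {
            for (int[] u : unitRules) {
                if (u[1] == s.charAt(i - 1)) p[i][1][u[0]] = true;
            }
        }
        for (int i = 2; i <= n; i++) {              // length of span
            for (int j = 1; j <= n - i + 1; j++) {  // start of span
                for (int k = 1; k <= i - 1; k++) {  // partition of span
                    for (int[] b : binaryRules) {
                        // The step from the question: left piece from RB, right piece from RC.
                        if (p[j][k][b[1]] && p[j + k][i - k][b[2]]) p[j][i][b[0]] = true;
                    }
                }
            }
        }
        for (int x : startSymbols) {
            if (p[1][n][x]) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        int[] start = {0};
        System.out.println(recognizes("baaba", 4, UNIT, BINARY, start)); // prints true
        System.out.println(recognizes("aa", 4, UNIT, BINARY, start));    // prints false
    }
}
```

With this grammar, "aa" is rejected because the only nonterminal deriving it is B (via B -> CC), and B is not a start symbol.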

Hope this helps!