CYK parsing algorithm data structure

Time: 2018-09-21 01:55:14

Tags: java arrays algorithm data-structures cyk

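I am reading an algorithm for CYK parsing, but I don't understand the data structure it uses, namely P[M,i,j] = new Tree(M,i,j,null,null,null,0.0); How would such an array be implemented in Java? The algorithm is as follows: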

    class Tree {
      NonTerm phrase;                % the non-terminal
      int startPhrase, endPhrase;    % indices of starting and ending word
      String word;                   % if a leaf, then the word
      Tree left;
      Tree right;
      double prob;
    }

    function CYK-PARSE(sentence,grammar) return P, a chart. {

    1. N = length(sentence);
    2. for (i = 1 to N) {
    3.   word = sentence[i];
    4.   for (each rule "POS --> Word [prob]" in the grammar)
    5.      P[POS,i,i] = new Tree(POS,i,i,word,null,null,prob);
    6. }                              % endfor line 2.

    7. for (length = 2 to N)          % length = length of phrase
    8.   for (i = 1 to N+1-length) {  % i == start of phrase
    9.     j = i+length-1;            % j == end of phrase
    10.    for (each NonTerm M)  {
    11.        P[M,i,j] = new Tree(M,i,j,null,null,null,0.0);
    12.        for (k = i to j-1)    % k = end of first subphrase
    13.            for (each rule "M -> Y,Z [prob]" in the grammar) {
    14.                newProb = P[Y,i,k].prob * P[Z,k+1,j].prob * prob;
    15.                if (newProb > P[M,i,j].prob) {
    16.                   P[M,i,j].left = P[Y,i,k];
    17.                   P[M,i,j].right = P[Z,k+1,j];
    18.                   P[M,i,j].prob = newProb;
    19.                }  % endif line 15
    20.            }      % endfor line 13
    21.      }            % endfor line 10
    22.    }              % endfor line 8

    23. return P;
    24. }  % end CYK-PARSE.
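
As a rough idea of how the pseudocode above might look in Java, here is a minimal sketch. It assumes the chart is a plain three-dimensional array `Tree[][][]`, with the first index taken from the ordinal of a hypothetical `NonTerm` enum and the word indices kept 1-based as in the pseudocode. The names `CykSketch`, `LexRule`, `BinRule` and the toy grammar in `main` are made up for illustration. Cells that were never filled are treated as probability 0, which the pseudocode does not state explicitly.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: CykSketch, NonTerm, LexRule and BinRule are illustrative
// names, not part of the original pseudocode.
class CykSketch {
    enum NonTerm { S, NOUN, VERB }  // assumed toy set of non-terminals

    static class Tree {
        NonTerm phrase;              // the non-terminal
        int startPhrase, endPhrase;  // 1-based indices of first and last word
        String word;                 // the word, if this node is a leaf
        Tree left, right;            // best sub-phrases, if not a leaf
        double prob;

        Tree(NonTerm phrase, int startPhrase, int endPhrase,
             String word, Tree left, Tree right, double prob) {
            this.phrase = phrase; this.startPhrase = startPhrase; this.endPhrase = endPhrase;
            this.word = word; this.left = left; this.right = right; this.prob = prob;
        }
    }

    // Lexical rule "POS --> Word [prob]".
    static class LexRule {
        final NonTerm pos; final String word; final double prob;
        LexRule(NonTerm pos, String word, double prob) {
            this.pos = pos; this.word = word; this.prob = prob;
        }
    }

    // Binary rule "M -> Y,Z [prob]".
    static class BinRule {
        final NonTerm m, y, z; final double prob;
        BinRule(NonTerm m, NonTerm y, NonTerm z, double prob) {
            this.m = m; this.y = y; this.z = z; this.prob = prob;
        }
    }

    static Tree[][][] parse(String[] sentence, List<LexRule> lexicon, List<BinRule> rules) {
        int n = sentence.length;
        // One extra slot per word dimension so indices 1..n work as in the pseudocode.
        Tree[][][] P = new Tree[NonTerm.values().length][n + 1][n + 1];

        // Lines 2-6: fill the diagonal with leaves for matching lexical rules.
        for (int i = 1; i <= n; i++) {
            String word = sentence[i - 1];
            for (LexRule r : lexicon) {
                if (r.word.equals(word)) {  // assumption: only rules for this word apply
                    P[r.pos.ordinal()][i][i] = new Tree(r.pos, i, i, word, null, null, r.prob);
                }
            }
        }

        // Lines 7-22: build longer phrases from shorter ones.
        for (int length = 2; length <= n; length++) {
            for (int i = 1; i <= n + 1 - length; i++) {
                int j = i + length - 1;
                for (NonTerm m : NonTerm.values()) {
                    Tree cell = new Tree(m, i, j, null, null, null, 0.0);
                    P[m.ordinal()][i][j] = cell;
                    for (int k = i; k < j; k++) {
                        for (BinRule r : rules) {
                            if (r.m != m) continue;
                            Tree l = P[r.y.ordinal()][i][k];
                            Tree rt = P[r.z.ordinal()][k + 1][j];
                            if (l == null || rt == null) continue;  // empty cell: prob 0
                            double newProb = l.prob * rt.prob * r.prob;
                            if (newProb > cell.prob) {
                                cell.left = l;
                                cell.right = rt;
                                cell.prob = newProb;
                            }
                        }
                    }
                }
            }
        }
        return P;
    }

    public static void main(String[] args) {
        List<LexRule> lexicon = Arrays.asList(
            new LexRule(NonTerm.NOUN, "fish", 0.5),
            new LexRule(NonTerm.VERB, "swim", 0.25));
        List<BinRule> rules = Arrays.asList(
            new BinRule(NonTerm.S, NonTerm.NOUN, NonTerm.VERB, 1.0));
        Tree[][][] P = parse(new String[] {"fish", "swim"}, lexicon, rules);
        Tree best = P[NonTerm.S.ordinal()][1][2];
        System.out.println(best.phrase + " " + best.prob);
    }
}
```

With this layout, `P[M,i,j]` from the pseudocode becomes `P[M.ordinal()][i][j]`, and the best parse of the whole sentence, if any, would sit in `P[NonTerm.S.ordinal()][1][n]`.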

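It says: "The main data structure in this procedure is the chart, which is the array P[M,I,J]. M is indexed over the non-terminals, and I and J range from 1 to N (the length of the sentence). P[M,I,J] is the node with NonTerm == M, startPhrase == I and endPhrase == J." I don't know what a chart is. If I were to implement this in Java, what data structure would I use for P[M,i,j], which holds Tree objects?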
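
On the question of what a chart is: going by the quoted description, it can be thought of as a lookup table with one cell per (non-terminal, start, end) triple, each cell holding the Tree for that span. A dense `Tree[][][]` array is one option. Another, sketched below, is a `HashMap` keyed by the triple, which only stores cells that were actually filled. `SparseChart`, `Key` and the trimmed-down `Tree` are hypothetical names for illustration, and non-terminals are assumed to be plain strings here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch: SparseChart and Key are illustrative names.
class SparseChart {
    // Trimmed-down Tree: only the fields this demonstration needs.
    static class Tree {
        final String phrase;
        final int startPhrase, endPhrase;
        double prob;
        Tree(String phrase, int startPhrase, int endPhrase, double prob) {
            this.phrase = phrase; this.startPhrase = startPhrase;
            this.endPhrase = endPhrase; this.prob = prob;
        }
    }

    // Immutable (M, i, j) triple used as the map key.
    static final class Key {
        final String nonTerm; final int start, end;
        Key(String nonTerm, int start, int end) {
            this.nonTerm = nonTerm; this.start = start; this.end = end;
        }
        @Override public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            Key k = (Key) o;
            return start == k.start && end == k.end && nonTerm.equals(k.nonTerm);
        }
        @Override public int hashCode() { return Objects.hash(nonTerm, start, end); }
    }

    private final Map<Key, Tree> cells = new HashMap<>();

    // Plays the role of "P[M,i,j] = tree".
    void put(Tree t) { cells.put(new Key(t.phrase, t.startPhrase, t.endPhrase), t); }

    // Plays the role of reading "P[M,i,j]"; returns null for an unfilled cell.
    Tree get(String m, int i, int j) { return cells.get(new Key(m, i, j)); }

    // Reads "P[M,i,j].prob", treating an unfilled cell as probability 0.
    double prob(String m, int i, int j) {
        Tree t = get(m, i, j);
        return t == null ? 0.0 : t.prob;
    }

    public static void main(String[] args) {
        SparseChart P = new SparseChart();
        P.put(new Tree("Noun", 1, 1, 0.5));
        System.out.println(P.prob("Noun", 1, 1));
        System.out.println(P.prob("Verb", 1, 1));
    }
}
```

The dense array is probably simpler and closer to the pseudocode. The map trades some lookup overhead for not allocating cells that stay empty, which may matter for long sentences or large grammars.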

0 answers:

No answers