NLTK fcfg parse error: IndexError when accessing the SEM feature

Time: 2018-01-14 12:17:14

Tags: python nltk

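Still working on further extending my grammar in a small application that converts NL to SQL. I have done more complex examples than the one explained below, so I am puzzled why this one doesn't parse, even though it is so simple.

I have "show movies" working, but I also want to be able to handle "movie list", and have defined the fcfg grammar as follows: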

% start S
S[SEM=(?np)] -> NP[SEM=?np]
NP[SEM=(?v + ?n)] -> V[SEM=?v] N[SEM=?n]
V[SEM='SELECT'] -> 'show' | 'list' | 'display'
N[SEM='title, description, category, rating FROM film_list'] -> 'movie' | 'movies' | 'film' | 'films'

Note that the noun comes first in the input. With trace = 2, I get the following parse log:
|.movie. list.|
Leaf Init Rule:
|[-----]     .| [0:1] 'movie'
|.     [-----]| [1:2] 'list'
Feature Bottom Up Predict Combine Rule:
|[-----]     .| [0:1] N[SEM='title, description, category, rating FROM film_list'] -> 'movie' *
Feature Bottom Up Predict Combine Rule:
|.     [-----]| [1:2] V[SEM='SELECT'] -> 'list' *
Feature Bottom Up Predict Combine Rule:
|.     [----->| [1:2] NP[SEM=(?v+?n)] -> V[SEM=?v] * N[SEM=?n] {?v: 'SELECT'}

Traceback (most recent call last):
  File "C:/Users/JP/PycharmProjects/imat5112/misc/test7.py", line 8, in <module>
    top_tree_semantics = trees[0].label()['SEM']  # get first list entry with semantics
IndexError: list index out of range

Parsed using the following code:

from nltk import load_parser

feature_cfg = load_parser('grammer_004_pos.fcfg', trace=2)
nlquery = 'movie list'
trees = list(feature_cfg.parse(nlquery.split()))  # collect all parse trees into a list
top_tree_semantics = trees[0].label()['SEM']  # get first list entry with semantics
top_tree_semantics = [s for s in top_tree_semantics if s]  # drop empty entries from the SEM tuple
sqlquery = ' '.join(top_tree_semantics)  # join each list element separated by space

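I'm really confused why this one ("movie list") fails, whereas something like "show movies" (with the noun and verb order swapped) works. Any help with my stupidity is much appreciated, cheers!

1 Answer:

Answer 0: (score: 0)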

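The order of the symbols on the RHS of a production matters. NP -> V N; N -> 'movie'; V -> 'list' cannot parse "movie list", but NP -> N V; N -> 'movie'; V -> 'list' can. Alternatively, use | on the RHS to list several variants of the RHS. Because the parser finds no tree for "movie list", trees is an empty list, and trees[0] raises the IndexError.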
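
To make the point concrete, here is a minimal sketch that compares the grammar from the question with a variant that adds a second RHS alternative. It assumes NLTK is installed and builds the grammars from strings instead of loading the .fcfg file. The names LEXICON, VERB_FIRST, BOTH_ORDERS and parse_all are made up for illustration and are not part of the original application.

```python
from nltk.grammar import FeatureGrammar
from nltk.parse import FeatureChartParser

# Lexical rules copied from the question.
LEXICON = """
V[SEM='SELECT'] -> 'show' | 'list' | 'display'
N[SEM='title, description, category, rating FROM film_list'] -> 'movie' | 'movies' | 'film' | 'films'
"""

# Grammar from the question: NP only allows verb-then-noun.
VERB_FIRST = """
% start S
S[SEM=(?np)] -> NP[SEM=?np]
NP[SEM=(?v + ?n)] -> V[SEM=?v] N[SEM=?n]
""" + LEXICON

# Hypothetical variant: a second RHS alternative also accepts noun-then-verb.
BOTH_ORDERS = """
% start S
S[SEM=(?np)] -> NP[SEM=?np]
NP[SEM=(?v + ?n)] -> V[SEM=?v] N[SEM=?n] | N[SEM=?n] V[SEM=?v]
""" + LEXICON


def parse_all(grammar_text, query):
    """Return every parse tree for a whitespace-tokenised query (possibly none)."""
    parser = FeatureChartParser(FeatureGrammar.fromstring(grammar_text))
    return list(parser.parse(query.split()))


for name, text in [("verb first", VERB_FIRST), ("both orders", BOTH_ORDERS)]:
    for query in ["list movie", "movie list"]:
        trees = parse_all(text, query)
        if trees:  # guard before indexing: an empty list is what raised IndexError
            print(name, repr(query), "->", trees[0].label()["SEM"])
        else:
            print(name, repr(query), "-> no parse")
```

Since SEM is still built as (?v + ?n) in both alternatives, the SELECT part should come first regardless of word order. Checking that trees is non-empty before indexing also lets the application report an unparseable query instead of crashing.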