Is there a standard diagnostic for handling unknown words in FCG?

Asked: 2016-08-23 14:30:56

Tags: nlp grammar

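I have an FCG grammar for English, and I am parsing text that contains out-of-vocabulary words. For now I have written my own custom diagnostics and repairs. Is there a standard way to handle unknown words in the latest FCG release?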

1 answer:

Answer 0 (score: 1)

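At the moment, writing your own custom diagnostics and repairs is indeed the best solution. The next FCG release, however, will include an integrated library of diagnostics and repairs. The ones for unknown words will look more or less like this: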

A diagnostic that detects unknown words (run after each node is created):

(defmethod diagnose ((diagnostic diagnose-unknown-words) (node cip-node)
                     &key &allow-other-keys)
  "Diagnose that the fully expanded structure contains untreated strings"
  (when (fully-expanded? node)
    (let ((strings-in-root (get-strings (assoc 'root
                                               (left-pole-structure
                                                (car-resulting-cfs (cipn-car node)))))))
      (when strings-in-root
        (let ((problem (make-instance 'unknown-words)))
          (set-data problem 'strings strings-in-root)
          problem)))))
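
To make the idea of "untreated strings in the root" concrete, here is a minimal sketch in plain Common Lisp that does not depend on FCG. The data layout (`*toy-structure*`) and the helpers `toy-get-strings` and `toy-unknown-words` are hypothetical stand-ins invented for illustration; they are not FCG's actual transient-structure representation or its `get-strings` function. The sketch assumes that a word counts as unknown when its `string` predicate is still sitting in the root unit after all constructions have applied.

```lisp
;; Hypothetical, simplified stand-in for a transient structure: a list of
;; units, each of the form (unit-name (feature value) ...).
;; This is NOT FCG's real data layout.
(defparameter *toy-structure*
  '((root (form ((string w-2 "snarks") (meets w-1 w-2))))
    (w-1 (form ((string w-1 "the"))) (syn-cat ((lex-class article))))))

(defun toy-get-strings (unit)
  "Return the (string unit-name word) predicates in UNIT's form feature."
  (let ((form (second (assoc 'form (rest unit)))))
    (remove-if-not (lambda (predicate) (eq (first predicate) 'string))
                   form)))

(defun toy-unknown-words (structure)
  "Return the word strings still left in the root unit of STRUCTURE."
  (mapcar #'third (toy-get-strings (assoc 'root structure))))
```

In this toy example, "the" has already been moved into its own unit, so `(toy-unknown-words *toy-structure*)` reports only the word that no construction has consumed.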

A repair that adds a new lexical construction (very generic, of course; you will need to adapt it to your own grammar):

(defmethod repair ((repair add-lexical-cxn)
                   (problem unknown-words)
                   (node cip-node)
                   &key &allow-other-keys)
  "Repair by making a new lexical construction for the first untreated string"
  (let ((uw (first (get-data problem 'strings))))
    (multiple-value-bind (cxn-set lex-cxn)
        (eval `(def-fcg-cxn ,(make-symbol (upcase (string-append uw "-cxn")))
                            ((?word-unit
                              (args (?ref))
                              (syn-cat (lex-class ?lex-class))
                              (sem-cat (sem-class ?sem-class)))
                             <-
                             (?word-unit
                              (HASH meaning ((,(intern (upcase uw)) ?ref)))
                              --
                              (HASH form ((string ?word-unit ,uw)))))
                            :cxn-inventory ,(copy-object (original-cxn-set (construction-inventory node)))
                            :cxn-set lex))
      (declare (ignore cxn-set))
      (make-instance 'fix
                     :repair repair
                     :problem problem
                     :restart-data lex-cxn))))
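
The repair derives two names from the unknown word: an uninterned symbol for the construction itself and an interned symbol used as the meaning predicate. The sketch below reproduces just that naming step with standard Common Lisp functions so it can be tried without FCG. It assumes that FCG's `upcase` and `string-append` utilities behave like `string-upcase` and `concatenate 'string`; the helpers `toy-cxn-name`, `toy-meaning-predicate` and `toy-lexical-cxn-skeleton` are hypothetical names made up for this illustration, and the skeleton is plain list data rather than a real construction.

```lisp
(defun toy-cxn-name (word)
  "Return an uninterned symbol naming the new construction for WORD."
  (make-symbol (string-upcase (concatenate 'string word "-cxn"))))

(defun toy-meaning-predicate (word)
  "Return a symbol interned in the current package, used as the meaning predicate."
  (intern (string-upcase word)))

(defun toy-lexical-cxn-skeleton (word)
  "Build the name, meaning and form parts of the construction as plain data."
  `(,(toy-cxn-name word)
    (meaning ((,(toy-meaning-predicate word) ?ref)))
    (form ((string ?word-unit ,word)))))
```

Because `make-symbol` returns a fresh uninterned symbol on every call, two repairs for the same word yield construction names that print alike but are not `eq`. If your grammar looks constructions up by name, interning the name may be the more convenient choice.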