Annotating sentences that span multiple lines in GATE

Time: 2016-08-11 09:15:59

Tags: nlp gate java-annotations

I am having a problem with the Sentence Splitter module in GATE. My text looks like this:

Social history. He drank a lot in his young age. He did
not attend a school. He was depressed of his condition.

We know the sentences should be split like this:
Sentence 1: Social history.
Sentence 2: He drank a lot in his young age.
Sentence 3: He did not attend a school.
Sentence 4: He was depressed of his condition.

However, the ANNIE Sentence Splitter assumes that text on different lines belongs to different sentences, and so produces:

Sentence 1: Social history.
Sentence 2: He drank a lot in his young age.
Sentence 3: He did 
Sentence 4: not attend a school.
Sentence 5: He was depressed of his condition.

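This happens because a sentence is broken across lines. Is there a way to tell the sentence splitter that a sentence may span multiple lines? Or is there a better way to identify sentences in this kind of text?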

Thanks :)

1 answer:

Answer 0 (score: 4)

Try using the RegEx Sentence Splitter instead of the ANNIE one.

Alternatively, the ANNIE Sentence Splitter has a TransducerURL parameter, which by default points to:

  

/PATH-TO-GATE/plugins/ANNIE/resources/sentenceSplitter/grammar/main-single-nl.jape

The same folder also contains another JAPE file:
  

/PATH-TO-GATE/plugins/ANNIE/resources/sentenceSplitter/grammar/main.jape

If you change TransducerURL to point to this file instead, it should work.
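
To illustrate the underlying idea outside GATE (a single newline is only a line wrap, not a sentence boundary), here is a minimal plain-Java sketch. It is not GATE's implementation and does not use the GATE API. The class and method names (MultiLineSentences, unwrap, splitSentences) are hypothetical, and the punctuation-based split is deliberately naive, so it will mis-split abbreviations such as "Dr.".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of GATE.
class MultiLineSentences {

    // Treat a single newline as a line wrap and replace it with a space.
    // Blank lines (two or more consecutive newlines) are left untouched.
    static String unwrap(String text) {
        return text.replace("\r\n", "\n")
                   .replaceAll("(?<!\\n)\\n(?!\\n)", " ");
    }

    // Naive split: break after '.', '!' or '?' followed by whitespace,
    // or at a blank line. This is an assumption made for the sketch,
    // not the rule set used by any GATE splitter.
    static List<String> splitSentences(String text) {
        List<String> sentences = new ArrayList<>();
        for (String part : unwrap(text).split("(?<=[.!?])\\s+|\\n{2,}")) {
            String s = part.trim();
            if (!s.isEmpty()) {
                sentences.add(s);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        String text = "Social history. He drank a lot in his young age. He did\n"
                    + "not attend a school. He was depressed of his condition.";
        int n = 1;
        for (String s : splitSentences(text)) {
            System.out.println("Sentence " + (n++) + ": " + s);
        }
    }
}
```

On the text from the question this prints four sentences, with "He did not attend a school." kept in one piece. The same unwrapping step could in principle be applied to a document before it is handed to any splitter, at the cost of shifting character offsets relative to the original file.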