Error loading list when adding a list to the Arabic plugin gazetteer

Date: 2016-12-30 15:31:34

Tags: nlp named-entity-recognition gate

I am trying to add a new list to the Arabic plugin gazetteer. I followed these steps:

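  1. Created a new file "tags.lst" in the directory "GATE_Developer_8.1\plugins\Lang_Arabic\resources\gazetteer\"
  2. Appended the line "tags.lst:tags::arabic" to the "lists.def" file
  3. On starting GATE, a window pops up showing the following message:

    Resource could not be created!

    gate.creole.ResourceInstantiationException: gate.util.GateRuntimeException: Error loading list: tags.lst: java.io.IOException: The system cannot find the path specified

    Here is the full exception: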

    gate.creole.ResourceInstantiationException: gate.util.GateRuntimeException: Error loading list: tags.lst: java.io.IOException: The system cannot find the path specified
        at gate.creole.gazetteer.LinearDefinition.load(LinearDefinition.java:281)
        at gate.creole.gazetteer.DefaultGazetteer.init(DefaultGazetteer.java:119)
        at gate.Factory.createResource(Factory.java:432)
        at gate.gui.NewResourceDialog$4.run(NewResourceDialog.java:257)
        at java.lang.Thread.run(Thread.java:745)
    Caused by: gate.util.GateRuntimeException: Error loading list: tags.lst: java.io.IOException: The system cannot find the path specified
        at gate.creole.gazetteer.LinearDefinition.add(LinearDefinition.java:527)
        at gate.creole.gazetteer.LinearDefinition.load(LinearDefinition.java:276)
        ... 4 more
    Caused by: gate.creole.ResourceInstantiationException: java.io.IOException: The system cannot find the path specified
        at gate.creole.gazetteer.LinearDefinition.loadSingleList(LinearDefinition.java:199)
        at gate.creole.gazetteer.LinearDefinition.loadSingleList(LinearDefinition.java:158)
        at gate.creole.gazetteer.LinearDefinition.add(LinearDefinition.java:520)
        ... 5 more
    Caused by: java.io.IOException: The system cannot find the path specified
        at java.io.WinNTFileSystem.createFileExclusively(Native Method)
        at java.io.File.createNewFile(File.java:1012)
        at gate.creole.gazetteer.LinearDefinition.loadSingleList(LinearDefinition.java:188)
        ... 7 more
    

    I would appreciate any help.

1 Answer:

Answer 0: (score: 1)

The problem had two main causes:

  1. The file was not saved correctly in UTF-8 encoding. This was fixed with an online converter: http://www.motobit.com/util/charset-codepage-conversion.asp

  2. The file contained the special characters [#|" |:]. These were replaced with spaces using the following replaceAll regex:

    line = line.replaceAll("[#|\"|:]", " ");
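
Both fixes can also be applied locally in one pass instead of using an online converter. The following is only a minimal sketch: the class `GazetteerListCleaner` and its methods are hypothetical names and not part of GATE. The source charset is an assumption the caller must supply (for example `windows-1256`, a common legacy Arabic encoding), since it is not stated which encoding the file originally used. Note that inside a character class `|` is a literal, so the pattern also replaces pipe characters.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of GATE: re-encodes a gazetteer list as UTF-8
// and replaces the characters named in the answer with spaces.
class GazetteerListCleaner {

    // Same pattern as in the answer. Inside a character class '|' is literal,
    // so this matches '#', '|', '"' and ':'.
    static String sanitize(String line) {
        return line.replaceAll("[#|\"|:]", " ");
    }

    // Reads `in` using `sourceCharset` (an assumption supplied by the caller)
    // and writes the sanitized lines to `out` as UTF-8.
    static void convert(Path in, Charset sourceCharset, Path out) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(in, sourceCharset);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(sanitize(line));
                writer.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: GazetteerListCleaner <input> <source-charset> <output>");
            return;
        }
        convert(Paths.get(args[0]), Charset.forName(args[1]), Paths.get(args[2]));
    }
}
```

The output path should differ from the input path, because the writer truncates its target as soon as it is opened. The cleaned file can then be copied over the original "tags.lst".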