我正在尝试从java上下文中运行UIMA RUTA脚本,但却遇到异常。
java.lang.ArrayIndexOutOfBoundsException: -1
at org.apache.uima.ruta.parser.RutaParser.emitErrorMessage(RutaParser.java:306)
at org.apache.uima.ruta.parser.RutaParser.reportError(RutaParser.java:327)
at org.antlr.runtime.BaseRecognizer.recoverFromMismatchedToken(BaseRecognizer.java:613)
at org.antlr.runtime.BaseRecognizer.match(BaseRecognizer.java:115)
at org.apache.uima.ruta.parser.RutaParser.file_input(RutaParser.java:566)
at org.apache.uima.ruta.engine.RutaEngine.loadScriptIS(RutaEngine.java:939)
深入时没有其他消息我看到了这个异常。
MissingTokenException(inserted [@-1,0:0='<missing EOF>',<-1>,15:0] at CALL)
看起来子脚本正在抛出异常,但同样的脚本在UIMA工作台中提供了正确的输出。
我在这里缺少什么?
ENGINE
<?xml version="1.0" encoding="UTF-8"?>
<analysisEngineDescription xmlns="http://uima.apache.org/resourceSpecifier">
<frameworkImplementation>org.apache.uima.java</frameworkImplementation>
<primitive>true</primitive>
<annotatorImplementationName>org.apache.uima.ruta.engine.RutaEngine</annotatorImplementationName>
<analysisEngineMetaData>
<name>org.test.MainEngine</name>
<description/>
<version>1.0</version>
<vendor/>
<configurationParameters searchStrategy="language_fallback">
<configurationParameter>
<name>seeders</name>
<type>String</type>
<multiValued>true</multiValued>
<mandatory>false</mandatory>
</configurationParameter>
<configurationParameter>
<name>debug</name>
<type>Boolean</type>
<multiValued>false</multiValued>
<mandatory>false</mandatory>
</configurationParameter>
<configurationParameter>
<name>additionalScripts</name>
<type>String</type>
<multiValued>true</multiValued>
<mandatory>false</mandatory>
</configurationParameter>
<configurationParameter>
<name>profile</name>
<type>Boolean</type>
<multiValued>false</multiValued>
<mandatory>false</mandatory>
</configurationParameter>
<configurationParameter>
<name>debugWithMatches</name>
<type>Boolean</type>
<multiValued>false</multiValued>
<mandatory>false</mandatory>
</configurationParameter>
<configurationParameter>
<name>statistics</name>
<type>Boolean</type>
<multiValued>false</multiValued>
<mandatory>false</mandatory>
</configurationParameter>
<configurationParameter>
<name>additionalEngines</name>
<type>String</type>
<multiValued>true</multiValued>
<mandatory>false</mandatory>
</configurationParameter>
<configurationParameter>
<name>additionalExtensions</name>
<type>String</type>
<multiValued>true</multiValued>
<mandatory>false</mandatory>
</configurationParameter>
<configurationParameter>
<name>debugOnlyFor</name>
<type>String</type>
<multiValued>true</multiValued>
<mandatory>false</mandatory>
</configurationParameter>
<configurationParameter>
<name>scriptEncoding</name>
<type>String</type>
<multiValued>false</multiValued>
<mandatory>false</mandatory>
</configurationParameter>
<configurationParameter>
<name>additionalEngineLoaders</name>
<type>String</type>
<multiValued>true</multiValued>
<mandatory>false</mandatory>
</configurationParameter>
<configurationParameter>
<name>resourcePaths</name>
<type>String</type>
<multiValued>true</multiValued>
<mandatory>false</mandatory>
</configurationParameter>
<configurationParameter>
<name>defaultFilteredTypes</name>
<type>String</type>
<multiValued>true</multiValued>
<mandatory>false</mandatory>
</configurationParameter>
<configurationParameter>
<name>mainScript</name>
<type>String</type>
<multiValued>false</multiValued>
<mandatory>false</mandatory>
</configurationParameter>
<configurationParameter>
<name>scriptPaths</name>
<type>String</type>
<multiValued>true</multiValued>
<mandatory>false</mandatory>
</configurationParameter>
<configurationParameter>
<name>descriptorPaths</name>
<type>String</type>
<multiValued>true</multiValued>
<mandatory>false</mandatory>
</configurationParameter>
<configurationParameter>
<name>removeBasics</name>
<type>Boolean</type>
<multiValued>false</multiValued>
<mandatory>false</mandatory>
</configurationParameter>
<configurationParameter>
<name>dynamicAnchoring</name>
<description>Activates dynamic anchoring (possible speed up).</description>
<type>Boolean</type>
<multiValued>false</multiValued>
<mandatory>false</mandatory>
</configurationParameter>
<configurationParameter>
<name>greedyRuleElement</name>
<description>Activates greedy anchoring for rule elements.</description>
<type>Boolean</type>
<multiValued>false</multiValued>
<mandatory>false</mandatory>
</configurationParameter>
<configurationParameter>
<name>greedyRule</name>
<description>Activates greedy anchoring for complete rules.</description>
<type>Boolean</type>
<multiValued>false</multiValued>
<mandatory>false</mandatory>
</configurationParameter>
<configurationParameter>
<name>lowMemoryProfile</name>
<type>Boolean</type>
<multiValued>false</multiValued>
<mandatory>false</mandatory>
</configurationParameter>
<configurationParameter>
<name>createdBy</name>
<type>Boolean</type>
<multiValued>false</multiValued>
<mandatory>false</mandatory>
</configurationParameter>
<configurationParameter>
<name>simpleGreedyForComposed</name>
<type>Boolean</type>
<multiValued>false</multiValued>
<mandatory>false</mandatory>
</configurationParameter>
<configurationParameter>
<name>additionalUimafitEngines</name>
<type>String</type>
<multiValued>true</multiValued>
<mandatory>false</mandatory>
</configurationParameter>
<configurationParameter>
<name>strictImports</name>
<type>Boolean</type>
<multiValued>false</multiValued>
<mandatory>false</mandatory>
</configurationParameter>
<configurationParameter>
<name>varNames</name>
<type>String</type>
<multiValued>true</multiValued>
<mandatory>false</mandatory>
</configurationParameter>
<configurationParameter>
<name>varValues</name>
<type>String</type>
<multiValued>true</multiValued>
<mandatory>false</mandatory>
</configurationParameter>
<configurationParameter>
<name>rules</name>
<type>String</type>
<multiValued>false</multiValued>
<mandatory>false</mandatory>
</configurationParameter>
<configurationParameter>
<name>dictRemoveWS</name>
<type>Boolean</type>
<multiValued>false</multiValued>
<mandatory>false</mandatory>
</configurationParameter>
<configurationParameter>
<name>reindexOnly</name>
<type>String</type>
<multiValued>true</multiValued>
<mandatory>false</mandatory>
</configurationParameter>
</configurationParameters>
<configurationParameterSettings>
<nameValuePair>
<name>debug</name>
<value>
<boolean>false</boolean>
</value>
</nameValuePair>
<nameValuePair>
<name>profile</name>
<value>
<boolean>false</boolean>
</value>
</nameValuePair>
<nameValuePair>
<name>debugWithMatches</name>
<value>
<boolean>true</boolean>
</value>
</nameValuePair>
<nameValuePair>
<name>defaultFilteredTypes</name>
<value>
<array>
<string>org.apache.uima.ruta.type.SPACE</string>
<string>org.apache.uima.ruta.type.BREAK</string>
<string>org.apache.uima.ruta.type.MARKUP</string>
</array>
</value>
</nameValuePair>
<nameValuePair>
<name>removeBasics</name>
<value>
<boolean>false</boolean>
</value>
</nameValuePair>
<nameValuePair>
<name>seeders</name>
<value>
<array>
<string>org.apache.uima.ruta.seed.DefaultSeeder</string>
</array>
</value>
</nameValuePair>
<nameValuePair>
<name>createdBy</name>
<value>
<boolean>false</boolean>
</value>
</nameValuePair>
<nameValuePair>
<name>mainScript</name>
<value>
<string>org.test.Main</string>
</value>
</nameValuePair>
<nameValuePair>
<name>scriptPaths</name>
<value>
<array>
<string>../../scripts</string>
</array>
</value>
</nameValuePair>
<nameValuePair>
<name>descriptorPaths</name>
<value>
<array>
<string>../descriptor</string>
</array>
</value>
</nameValuePair>
<nameValuePair>
<name>resourcePaths</name>
<value>
<array>
<string>/Users/Gaurav/Documents/workspace/Paragraph/resources</string>
</array>
</value>
</nameValuePair>
<nameValuePair>
<name>additionalScripts</name>
<value>
<array>
<string>org.test.Email</string>
<!--<string>org.test.number.Number</string>
<string>org.test.Date</string>-->
<!--<string>org.test.USAAddress</string>
<string>org.test.Name</string>
<string>org.test.number.PhoneNumber</string>-->
</array>
</value>
</nameValuePair>
<nameValuePair>
<name>additionalEngines</name>
<value>
<array/>
</value>
</nameValuePair>
<nameValuePair>
<name>additionalUimafitEngines</name>
<value>
<array/>
</value>
</nameValuePair>
<nameValuePair>
<name>additionalExtensions</name>
<value>
<array>
<string>org.apache.uima.ruta.string.bool.BooleanOperationsExtension</string>
<string>org.apache.uima.ruta.string.StringOperationsExtension</string>
<string>org.apache.uima.ruta.block.OnlyFirstBlockExtension</string>
<string>org.apache.uima.ruta.block.OnlyOnceBlockExtension</string>
<string>org.apache.uima.ruta.block.fst.FSTBlockExtension</string>
</array>
</value>
</nameValuePair>
<nameValuePair>
<name>additionalEngineLoaders</name>
<value>
<array/>
</value>
</nameValuePair>
</configurationParameterSettings>
<typeSystemDescription>
<imports>
<import location="MainTypeSystem.xml"/>
</imports>
</typeSystemDescription>
<typePriorities>
<priorityList>
<type>org.apache.uima.ruta.type.RutaFrame</type>
<type>uima.tcas.Annotation</type>
<type>org.apache.uima.ruta.type.RutaBasic</type>
</priorityList>
</typePriorities>
<fsIndexCollection/>
<capabilities>
<capability>
<inputs/>
<outputs/>
<languagesSupported/>
</capability>
<capability>
<inputs>
<type>org.test.Main.Filters</type>
</inputs>
<outputs>
<type>org.test.Main.Filters</type>
</outputs>
<languagesSupported/>
</capability>
</capabilities>
<operationalProperties>
<modifiesCas>true</modifiesCas>
<multipleDeploymentAllowed>true</multipleDeploymentAllowed>
<outputsNewCASes>true</outputsNewCASes>
</operationalProperties>
</analysisEngineMetaData>
<resourceManagerConfiguration/>
</analysisEngineDescription>
脚本
PACKAGE org.test;
SCRIPT org.test.Email;
Document{->LOG("starting Processed")};
WORDLIST FiltersList = 'test/dictionaries/ValueFilters.txt';
DECLARE Filters;
DocumentAnnotation{-> MARKFAST(Filters, FiltersList)};
CALL(Email);
Document{-> ADDRETAINTYPE(MARKUP)};
答案 0 :(得分:2)
修复了问题,Ruta子脚本调用存在问题。
我们必须使用以下语法来调用下标
DocumentAnnotation{->CALL(Email)};
而不是
CALL(Email);