UIMA RUTA中的Array IndexOutOfBound异常

时间:2017-04-03 14:11:54

标签: java exception uima ruta

我正在尝试从java上下文中运行UIMA RUTA脚本,但却遇到异常。

java.lang.ArrayIndexOutOfBoundsException: -1
    at org.apache.uima.ruta.parser.RutaParser.emitErrorMessage(RutaParser.java:306)
    at org.apache.uima.ruta.parser.RutaParser.reportError(RutaParser.java:327)
    at org.antlr.runtime.BaseRecognizer.recoverFromMismatchedToken(BaseRecognizer.java:613)
    at org.antlr.runtime.BaseRecognizer.match(BaseRecognizer.java:115)
    at org.apache.uima.ruta.parser.RutaParser.file_input(RutaParser.java:566)
    at org.apache.uima.ruta.engine.RutaEngine.loadScriptIS(RutaEngine.java:939)

深入时没有其他消息我看到了这个异常。

MissingTokenException(inserted [@-1,0:0='<missing EOF>',<-1>,15:0] at CALL)

看起来子脚本正在抛出异常,但同样的脚本在UIMA工作台中提供了正确的输出。

我在这里缺少什么?

ENGINE

<?xml version="1.0" encoding="UTF-8"?>
<analysisEngineDescription xmlns="http://uima.apache.org/resourceSpecifier">
    <frameworkImplementation>org.apache.uima.java</frameworkImplementation>
    <primitive>true</primitive>
    <annotatorImplementationName>org.apache.uima.ruta.engine.RutaEngine</annotatorImplementationName>
    <analysisEngineMetaData>
        <name>org.test.MainEngine</name>
        <description/>
        <version>1.0</version>
        <vendor/>
        <configurationParameters searchStrategy="language_fallback">
            <configurationParameter>
                <name>seeders</name>
                <type>String</type>
                <multiValued>true</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>debug</name>
                <type>Boolean</type>
                <multiValued>false</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>additionalScripts</name>
                <type>String</type>
                <multiValued>true</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>profile</name>
                <type>Boolean</type>
                <multiValued>false</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>debugWithMatches</name>
                <type>Boolean</type>
                <multiValued>false</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>statistics</name>
                <type>Boolean</type>
                <multiValued>false</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>additionalEngines</name>
                <type>String</type>
                <multiValued>true</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>additionalExtensions</name>
                <type>String</type>
                <multiValued>true</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>debugOnlyFor</name>
                <type>String</type>
                <multiValued>true</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>scriptEncoding</name>
                <type>String</type>
                <multiValued>false</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>additionalEngineLoaders</name>
                <type>String</type>
                <multiValued>true</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>resourcePaths</name>
                <type>String</type>
                <multiValued>true</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>defaultFilteredTypes</name>
                <type>String</type>
                <multiValued>true</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>mainScript</name>
                <type>String</type>
                <multiValued>false</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>scriptPaths</name>
                <type>String</type>
                <multiValued>true</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>descriptorPaths</name>
                <type>String</type>
                <multiValued>true</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>removeBasics</name>
                <type>Boolean</type>
                <multiValued>false</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>dynamicAnchoring</name>
                <description>Activates dynamic anchoring (possible speed up).</description>
                <type>Boolean</type>
                <multiValued>false</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>greedyRuleElement</name>
                <description>Activates greedy anchoring for rule elements.</description>
                <type>Boolean</type>
                <multiValued>false</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>greedyRule</name>
                <description>Activates greedy anchoring for complete rules.</description>
                <type>Boolean</type>
                <multiValued>false</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>lowMemoryProfile</name>
                <type>Boolean</type>
                <multiValued>false</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>createdBy</name>
                <type>Boolean</type>
                <multiValued>false</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>simpleGreedyForComposed</name>
                <type>Boolean</type>
                <multiValued>false</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>additionalUimafitEngines</name>
                <type>String</type>
                <multiValued>true</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>strictImports</name>
                <type>Boolean</type>
                <multiValued>false</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>varNames</name>
                <type>String</type>
                <multiValued>true</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>varValues</name>
                <type>String</type>
                <multiValued>true</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>rules</name>
                <type>String</type>
                <multiValued>false</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>dictRemoveWS</name>
                <type>Boolean</type>
                <multiValued>false</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>reindexOnly</name>
                <type>String</type>
                <multiValued>true</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
        </configurationParameters>
        <configurationParameterSettings>
            <nameValuePair>
                <name>debug</name>
                <value>
                    <boolean>false</boolean>
                </value>
            </nameValuePair>
            <nameValuePair>
                <name>profile</name>
                <value>
                    <boolean>false</boolean>
                </value>
            </nameValuePair>
            <nameValuePair>
                <name>debugWithMatches</name>
                <value>
                    <boolean>true</boolean>
                </value>
            </nameValuePair>
            <nameValuePair>
                <name>defaultFilteredTypes</name>
                <value>
                    <array>
                        <string>org.apache.uima.ruta.type.SPACE</string>
                        <string>org.apache.uima.ruta.type.BREAK</string>
                        <string>org.apache.uima.ruta.type.MARKUP</string>
                    </array>
                </value>
            </nameValuePair>
            <nameValuePair>
                <name>removeBasics</name>
                <value>
                    <boolean>false</boolean>
                </value>
            </nameValuePair>
            <nameValuePair>
                <name>seeders</name>
                <value>
                    <array>
                        <string>org.apache.uima.ruta.seed.DefaultSeeder</string>
                    </array>
                </value>
            </nameValuePair>
            <nameValuePair>
                <name>createdBy</name>
                <value>
                    <boolean>false</boolean>
                </value>
            </nameValuePair>
            <nameValuePair>
                <name>mainScript</name>
                <value>
                    <string>org.test.Main</string>
                </value>
            </nameValuePair>
            <nameValuePair>
                <name>scriptPaths</name>
                <value>
                    <array>
                        <string>../../scripts</string>
                    </array>
                </value>
            </nameValuePair>
            <nameValuePair>
                <name>descriptorPaths</name>
                <value>
                    <array>
                        <string>../descriptor</string>
                    </array>
                </value>
            </nameValuePair>
            <nameValuePair>
                <name>resourcePaths</name>
                <value>
                    <array>
                        <string>/Users/Gaurav/Documents/workspace/Paragraph/resources</string>
                    </array>
                </value>
            </nameValuePair>
            <nameValuePair>
                <name>additionalScripts</name>
                <value>
                    <array>
                     <string>org.test.Email</string>
                        <!--<string>org.test.number.Number</string>
                        <string>org.test.Date</string>-->
                        <!--<string>org.test.USAAddress</string>
                        <string>org.test.Name</string>
                        <string>org.test.number.PhoneNumber</string>-->
                    </array>
                </value>
            </nameValuePair>
            <nameValuePair>
                <name>additionalEngines</name>
                <value>
                    <array/>
                </value>
            </nameValuePair>
            <nameValuePair>
                <name>additionalUimafitEngines</name>
                <value>
                    <array/>
                </value>
            </nameValuePair>
            <nameValuePair>
                <name>additionalExtensions</name>
                <value>
                    <array>
                        <string>org.apache.uima.ruta.string.bool.BooleanOperationsExtension</string>
                        <string>org.apache.uima.ruta.string.StringOperationsExtension</string>
                        <string>org.apache.uima.ruta.block.OnlyFirstBlockExtension</string>
                        <string>org.apache.uima.ruta.block.OnlyOnceBlockExtension</string>
                        <string>org.apache.uima.ruta.block.fst.FSTBlockExtension</string>
                    </array>
                </value>
            </nameValuePair>
            <nameValuePair>
                <name>additionalEngineLoaders</name>
                <value>
                    <array/>
                </value>
            </nameValuePair>
        </configurationParameterSettings>
        <typeSystemDescription>
            <imports>
                <import location="MainTypeSystem.xml"/>
            </imports>
        </typeSystemDescription>
        <typePriorities>
            <priorityList>
                <type>org.apache.uima.ruta.type.RutaFrame</type>
                <type>uima.tcas.Annotation</type>
                <type>org.apache.uima.ruta.type.RutaBasic</type>
            </priorityList>
        </typePriorities>
        <fsIndexCollection/>
        <capabilities>
            <capability>
                <inputs/>
                <outputs/>
                <languagesSupported/>
            </capability>
            <capability>
                <inputs>
                    <type>org.test.Main.Filters</type>
                </inputs>
                <outputs>
                    <type>org.test.Main.Filters</type>
                </outputs>
                <languagesSupported/>
            </capability>
        </capabilities>
        <operationalProperties>
            <modifiesCas>true</modifiesCas>
            <multipleDeploymentAllowed>true</multipleDeploymentAllowed>
            <outputsNewCASes>true</outputsNewCASes>
        </operationalProperties>
    </analysisEngineMetaData>
    <resourceManagerConfiguration/>
</analysisEngineDescription>

脚本

PACKAGE org.test;

SCRIPT org.test.Email;

Document{->LOG("starting Processed")};
WORDLIST FiltersList = 'test/dictionaries/ValueFilters.txt';
DECLARE Filters;
DocumentAnnotation{-> MARKFAST(Filters, FiltersList)};


CALL(Email);

Document{-> ADDRETAINTYPE(MARKUP)};

1 个答案:

答案 0 :(得分:2)

修复了问题,Ruta子脚本调用存在问题。

我们必须使用以下语法来调用下标

DocumentAnnotation{->CALL(Email)};

而不是

CALL(Email);