Catch block not invoked on Groovy exception in NiFi ExecuteScript processor

Time: 2018-10-04 17:41:53

Tags: groovy apache-nifi

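I am using NiFi ExecuteScript to call a Groovy script that extracts text from PDFs. When extraction fails, the script should throw an exception and the flowfile should be routed to REL_FAILURE. Some PDFs go through without problems, while others produce this error: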

ExecuteScript[id=9a39e0cb-ebcc-31e4-a169-575e367046e9] Failed to process session due to javax.script.ScriptException: javax.script.ScriptException: java.lang.IllegalStateException: StandardFlowFileRecord[uuid=2d6540f7-b7a2-48c7-8978-6b90bbfb0ff5,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1538596326047-12, container=default, section=12], offset=2134, length=930225],offset=0,name=1  i-9 INS rev 87   05-07-87.pdf,size=930225] already in use for an active callback or an OutputStream created by ProcessSession.write(FlowFile) has not been closed: org.apache.nifi.processor.exception.ProcessException: javax.script.ScriptException: javax.script.ScriptException: java.lang.IllegalStateException: StandardFlowFileRecord[uuid=2d6540f7-b7a2-48c7-8978-6b90bbfb0ff5,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1538596326047-12, container=default, section=12], offset=2134, length=930225],offset=0,name=1  i-9 INS rev 87   05-07-87.pdf,size=930225] already in use for an active callback or an OutputStream created by ProcessSession.write(FlowFile) has not been closed

My (simplified) code is as follows:

def flowFile = session.get()
if(!flowFile) return
flowFile = session.write(flowFile, { inputStream, outputStream ->
    try {
        // Load the PDF from inputStream and parse its text into a JSON string

        // If nothing can be extracted, throw an exception so the flowfile
        // can be routed to REL_FAILURE and processed further down the NiFi pipeline
        if(outputLength < 15) {
            throw new Exception('No output, send to REL_FAILURE')
        }

        // Write the string to the flowFile to be transferred
        outputStream.write(json.getBytes(StandardCharsets.UTF_8))
    } catch (Exception e){
        System.out.println(e.getMessage())
        session.transfer(flowFile, REL_FAILURE)
    }
} as StreamCallback)
session.transfer(flowFile, REL_SUCCESS)

It closely follows the cookbook posted in the Hortonworks community forum, whose author even mentions that closing the streams is handled automatically.

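I believe the error occurs because the PDF cannot be processed. That raises an exception, which should be caught by the try{}catch{} and cause the flowfile to be transferred to REL_FAILURE. Instead, it seems the catch{} is never invoked, so the outputStream object is never closed either. When I run the same Groovy code outside NiFi, it works as expected and the exception is caught.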

If you want to try running it on your own server:

NiFi template

full Groovy code

Sample PDF

1 answer:

Answer 0 (score: 1)

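The try/catch should be outside the session.write() call, not inside the callback. Inside the callback, throw an IOException rather than Exception; it should propagate up through session.write() and enter the catch clause outside. You can then transfer the flow file to failure (a flow file cannot be transferred while it is still being written to).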
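
A minimal sketch of that restructuring, assuming the standard ExecuteScript bindings (session, log, REL_SUCCESS, REL_FAILURE). The extractJson helper is a hypothetical stand-in for the PDF extraction code from the question, which is not shown here. The outer catch uses Exception because NiFi may wrap the IOException in a ProcessException before it leaves session.write().

```groovy
import org.apache.nifi.processor.io.StreamCallback
import java.nio.charset.StandardCharsets

// Hypothetical placeholder: the real script would parse the PDF here
// and return the extracted text as a JSON string.
String extractJson(InputStream inputStream) {
    return inputStream.getText('UTF-8')
}

def flowFile = session.get()
if (!flowFile) return

try {
    flowFile = session.write(flowFile, { inputStream, outputStream ->
        String json = extractJson(inputStream)

        // Signal failure with an IOException so it propagates out of session.write()
        if (json.length() < 15) {
            throw new IOException('No output, send to REL_FAILURE')
        }

        outputStream.write(json.getBytes(StandardCharsets.UTF_8))
    } as StreamCallback)

    // Reached only when the callback completed without throwing
    session.transfer(flowFile, REL_SUCCESS)
} catch (Exception e) {
    // The write has ended by now, so the original flowfile can be transferred
    log.error('PDF text extraction failed', e)
    session.transfer(flowFile, REL_FAILURE)
}
```

Keeping the REL_SUCCESS transfer inside the try block also means a failed flowfile is never transferred twice, which the original script would have attempted after its catch block ran.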