Handling exceptions globally in a data pipeline

Time: 2019-01-17 14:01:40

Tags: apache-flink

I have a data pipeline with 5 different tasks. If an exception occurs in any of the tasks, the record should be moved to an error Kafka topic. Is there any exception handler hook for this?

1 answer:

Answer 0 (score: 1):

I would suggest using Flink's side output functionality to collect the exceptions and then write them to the error Kafka topic:

// Tag identifying the error side output shared by all tasks
final OutputTag<String> outputTag = new OutputTag<String>("side-output"){};

SingleOutputStreamOperator<Integer> task1 = ...;
SingleOutputStreamOperator<Integer> task2 = ...;
SingleOutputStreamOperator<Integer> task3 = ...;

// Pull the error records that each task emitted to the side output
DataStream<String> exceptions1 = task1.getSideOutput(outputTag);
DataStream<String> exceptions2 = task2.getSideOutput(outputTag);
DataStream<String> exceptions3 = task3.getSideOutput(outputTag);

// Merge all error streams and write them to the error Kafka topic
DataStream<String> exceptions = exceptions1.union(exceptions2, exceptions3);
exceptions.addSink(new FlinkKafkaProducer(...));
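
The snippet above only collects the side outputs; each task still has to emit its failures there. Below is a minimal sketch of how one task could do that, assuming it is implemented as a ProcessFunction (the class name TaskOneFunction and the parseAndTransform helper are illustrative placeholders, not part of the original answer):

import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

public class TaskOneFunction extends ProcessFunction<String, Integer> {

    private final OutputTag<String> errorTag;

    public TaskOneFunction(OutputTag<String> errorTag) {
        this.errorTag = errorTag;
    }

    @Override
    public void processElement(String value, Context ctx, Collector<Integer> out) {
        try {
            // parseAndTransform stands in for the task's real business logic
            out.collect(parseAndTransform(value));
        } catch (Exception e) {
            // Instead of failing the job, route the bad record plus the error message
            // to the side output, which is later unioned and written to Kafka
            ctx.output(errorTag, value + " -> " + e.getMessage());
        }
    }

    private int parseAndTransform(String value) {
        return Integer.parseInt(value.trim());
    }
}

A task would then be wired in as, for example, SingleOutputStreamOperator<Integer> task1 = input.process(new TaskOneFunction(outputTag));, after which task1.getSideOutput(outputTag) returns exactly the records emitted in the catch block.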

Update

You could also wrap each result in an Either type, putting the payload into Left and the exception into Right. At the end of the pipeline you then need to split the stream into payloads and exceptions via the split/select functions:

DataStream<Either<Payload, Exception>> stage2 = stage1.flatMap(...);

DataStream<Either<Payload2, Exception>> stage3 = stage2.flatMap((Either<Payload, Exception> payload, Collector out) -> {
    if (payload.isLeft()) {
        out.collect(Left.of(map(payload.left())));
    } else {
        out.collect(Right.of(payload.right()));
    }
});

SplitStream<Either<Payload2, Exception>> split = stage3.split((Either<Payload2, Exception> value) -> {
    if (value.isLeft()) {
        return Collections.singleton("left");
    } else {
        return Collections.singleton("right");
    }
});

DataStream<Either<Payload2, Exception>> payloads = split.select("left");
DataStream<Either<Payload2, Exception>> exceptions = split.select("right");
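
For completeness, here is a minimal sketch of how the elided stage1.flatMap(...) could produce the Either-wrapped stream. It assumes stage1 is a DataStream<String> and uses a hypothetical parse(...) helper that returns a Payload and may throw; it relies on org.apache.flink.types.Either and an anonymous FlatMapFunction, and is not part of the original answer:

DataStream<Either<Payload, Exception>> stage2 = stage1.flatMap(
        new FlatMapFunction<String, Either<Payload, Exception>>() {
            @Override
            public void flatMap(String value, Collector<Either<Payload, Exception>> out) {
                try {
                    // parse(...) is a placeholder for the real per-record transformation
                    out.collect(Either.Left(parse(value)));
                } catch (Exception e) {
                    // Keep the pipeline running and carry the failure downstream as Right
                    out.collect(Either.Right(e));
                }
            }
        });

Later stages can then keep processing Left values and simply forward Right values, exactly as the stage3 flatMap above does.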
