我有一个SingleOutputStreamOperator
,我要对其进行一些处理,并且需要对其进行多个keyBy()
。
这是示例代码:
public SingleOutputStreamOperator<Map<String, Object>> process(DataStreamSource<Map<String, Object>> stream) {
BroadcastStream<Map<String,Object>> broadcastedStream = ...;
return stream
.assignTimestampsAndWatermarks(...)
.keyBy(new MyKeySelector("fieldAAA"))
.window(SlidingEventTimeWindows.of(Time.seconds(30), Time.seconds(10)))
.aggregate(new MyAggregationFunction1())
.keyBy(new MapKeySelector("fieldAAA"))
.connect(broadcastedStream)
.process(new MyEvaluator()) // <-- 'fieldBBB' is built here
.keyBy(new MyKeySelector("fieldBBB"))
.window(SlidingEventTimeWindows.of(Time.seconds(30), Time.seconds(10)))
.aggregate(new MyAggregationFunction2());
}
但是出现以下错误:
2019-10-10 10:40:07.664 INFO org.apache.flink.runtime.taskmanager.Task - Co-Process-Broadcast-Keyed (6/12) (50ed3ab36b8f4078d865b7026cab08e5) switched from RUNNING to FAILED.
java.lang.RuntimeException: null
at org.apache.flink.streaming.runtime.io.RecordWriterOutput.pushToRecordWriter(RecordWriterOutput.java:110)
at org.apache.flink.streaming.runtime.io.RecordWriterOutput.collect(RecordWriterOutput.java:89)
at org.apache.flink.streaming.runtime.io.RecordWriterOutput.collect(RecordWriterOutput.java:45)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:718)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:696)
at org.apache.flink.streaming.api.operators.TimestampedCollector.collect(TimestampedCollector.java:51)
at my.package.processElement(MyEvaluator.java:54)
at my.package.processElement(MyEvaluator.java:22)
at org.apache.flink.streaming.api.operators.co.CoBroadcastWithKeyedOperator.processElement1(CoBroadcastWithKeyedOperator.java:113)
at org.apache.flink.streaming.runtime.io.StreamTwoInputProcessor.processInput(StreamTwoInputProcessor.java:238)
at org.apache.flink.streaming.runtime.tasks.TwoInputStreamTask.run(TwoInputStreamTask.java:117)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.lang.NullPointerException: null
at org.apache.flink.runtime.state.KeyGroupRangeAssignment.assignToKeyGroup(KeyGroupRangeAssignment.java:59)
at org.apache.flink.runtime.state.KeyGroupRangeAssignment.assignKeyToParallelOperator(KeyGroupRangeAssignment.java:48)
at org.apache.flink.streaming.runtime.partitioner.KeyGroupStreamPartitioner.selectChannel(KeyGroupStreamPartitioner.java:58)
at org.apache.flink.streaming.runtime.partitioner.KeyGroupStreamPartitioner.selectChannel(KeyGroupStreamPartitioner.java:32)
at org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:128)
at org.apache.flink.streaming.runtime.io.RecordWriterOutput.pushToRecordWriter(RecordWriterOutput.java:107)
... 13 common frames omitted
从堆栈跟踪看,错误似乎在processElement
(KeyedBroadcastProcessFunction)的MyEvaluator
函数内部:
@Override
public void processElement(Map<String, Object> value, ReadOnlyContext ctx, Collector<Map<String, Object>> out) throws Exception {
List<Map<String, Object>> newFieldsList = ... ; // Retrieve a list of new fields based on 'value' and elements received using the broadcastedStream
for(Map<String, Object> newFields : newFieldsList){
value.putAll(newFields); // Add all newFields to current value
out.collect(value); // <-- NPE occur here
}
}
但是在我的处理过程中,如果我删除带有keyBy("fieldBBB")
的零件,则代码将运行。更有趣的是,如果我将keyBy("fieldBBB")
替换为keyBy("fieldAAA")
,则代码将运行。
您如何解释这种行为,我该怎么做?