Caused by: kafka.common.OffsetOutOfRangeException

Asked: 2017-12-06 00:40:18

Tags: apache-spark amazon-s3 apache-kafka spark-streaming

I am using Kafka and Spark Streaming to stream data updates into my HBase table.

But I keep getting OffsetOutOfRangeException. Here is my code:

    new KafkaStreamBuilder()
        .setStreamingContext(streamingContext)
        .setTopics(topics)
        .setDataSourceId(dataSourceId)
        .setOffsetManager(offsetManager)
        .setConsumerParameters(
            ImmutableMap.<String, String>builder()
                .putAll(kafkaConsumerParams)
                .put("group.id", groupId)
                .put("metadata.broker.list", kafkaBroker)
                .build())
        .build()
        .foreachRDD(
            rdd -> {
                rdd.foreachPartition(
                    iter -> {
                        final Table hTable = createHbaseTable(settings);
                        try {
                            // Advancing the iterator performs the actual Kafka
                            // fetch, so OffsetOutOfRangeException surfaces here.
                            while (iter.hasNext()) {
                                String json = new String(iter.next());
                                try {
                                    putRow(hTable, json, settings, barrier);
                                } catch (Exception e) {
                                    throw new RuntimeException("hbase write failure", e);
                                }
                            }
                        } catch (OffsetOutOfRangeException e) {
                            throw new RuntimeException(
                                "encountered OffsetOutOfRangeException", e);
                        }
                    });
            });

I have the streaming job scheduled to run every 5 minutes. Each time my consumer finishes a batch, it writes the latest markers and a checkpoint to S3. Before the next run, the job reads the previous checkpoint and markers back from S3 and resumes consuming from there.
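For reference, a minimal sketch of how the offsets covered by each finished batch could be captured and persisted, assuming a Kafka 0.8-style direct stream (matching the org.apache.spark.streaming.kafka classes in the stack trace below); persistOffsetToS3 is a hypothetical stand-in for whatever the custom OffsetManager actually writes:

    import org.apache.spark.streaming.kafka.HasOffsetRanges;
    import org.apache.spark.streaming.kafka.OffsetRange;

    // After a batch completes, record exactly which offsets it covered so the
    // next run can resume from untilOffset (the first offset NOT yet consumed).
    stream.foreachRDD(rdd -> {
        OffsetRange[] ranges = ((HasOffsetRanges) rdd.rdd()).offsetRanges();
        for (OffsetRange r : ranges) {
            // Hypothetical helper: persist (topic, partition, untilOffset) to S3.
            persistOffsetToS3(r.topic(), r.partition(), r.untilOffset());
        }
    });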

Here is the exception stack trace:

at org.apache.spark.api.java.JavaRDDLike$$anonfun$foreachPartition$1.apply(JavaRDDLike.scala:219)
    at org.apache.spark.api.java.JavaRDDLike$$anonfun$foreachPartition$1.apply(JavaRDDLike.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:925)
    at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:925)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1944)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1944)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:99)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: kafka.common.OffsetOutOfRangeException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at java.lang.Class.newInstance(Class.java:442)
    at kafka.common.ErrorMapping$.exceptionFor(ErrorMapping.scala:86)
    at org.apache.spark.streaming.kafka.KafkaRDD$KafkaRDDIterator.handleFetchErr(KafkaRDD.scala:188)
    at org.apache.spark.streaming.kafka.KafkaRDD$KafkaRDDIterator.fetchBatch(KafkaRDD.scala:197)
    at org.apache.spark.streaming.kafka.KafkaRDD$KafkaRDDIterator.getNext(KafkaRDD.scala:212)
    at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
    at scala.collection.convert.Wrappers$IteratorWrapper.hasNext(Wrappers.scala:30)

What I have done: I have verified that the markers and checkpoints both work as expected.
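One common cause of this exception is that the checkpointed offsets fall behind Kafka's log retention: if the broker has already deleted the segments containing a stored offset, the next fetch from that offset is out of range. A minimal sketch of a sanity check, assuming a kafka-clients 0.10.1+ consumer is available (the job itself runs the older 0.8 client visible in the stack trace, so this is illustrative only):

    import java.util.Collections;
    import java.util.Map;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    // Returns true if storedOffset still lies inside the broker's retained log.
    // If it is older than beginningOffsets, retention has deleted that data, and
    // fetching from storedOffset would raise OffsetOutOfRangeException.
    static boolean offsetStillRetained(KafkaConsumer<?, ?> consumer,
                                       String topic, int partition,
                                       long storedOffset) {
        TopicPartition tp = new TopicPartition(topic, partition);
        Map<TopicPartition, Long> earliest =
            consumer.beginningOffsets(Collections.singletonList(tp));
        return storedOffset >= earliest.get(tp);
    }

If such a check fails, one reasonable recovery would be to clamp the starting offset to the earliest retained offset (accepting the data loss) rather than resuming from the stale checkpoint.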

So I am a bit lost here. How can this exception happen, and what are the possible/reasonable solutions?

Thanks!

0 Answers