When I try to run this code:
Map<String, Object> kafkaParams = new HashMap<>();
kafkaParams.put("bootstrap.servers", "localhost:9092");
kafkaParams.put("key.deserializer", StringDeserializer.class);
kafkaParams.put("value.deserializer", StringDeserializer.class);
kafkaParams.put("group.id", "use_a_separate_group_id_for_each_stream");
kafkaParams.put("auto.offset.reset", "latest");
kafkaParams.put("enable.auto.commit", false);
Collection<String> topics = Arrays.asList("topicA", "topicB");
JavaInputDStream<ConsumerRecord<String, String>> stream =
    KafkaUtils.createDirectStream(
        ssc,
        LocationStrategies.PreferConsistent(),
        ConsumerStrategies.<String, String>Subscribe(topics, kafkaParams)
    );
stream.mapToPair(record -> new Tuple2<>(record.key(), record.value()));
I always get the following message:
2018-07-25 11:10:26 WARN KafkaUtils:66 - overriding auto.offset.reset to none for executor
I analyzed the code and noticed that the method fixKafkaParams always runs. How can I solve this problem?
Answer 0 (score: 0):
You should manage the offsets yourself. When Spark manages the offsets, it overrides this value: if auto.offset.reset stayed at latest, the job would try to apply it on every batch, whereas the executors must instead read messages starting from the exact offsets they are given. This cannot be changed.
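For reference, here is a minimal sketch of committing the offsets yourself after each batch, following the commit pattern from the Spark Streaming + Kafka 0.10 integration guide. It reuses the stream variable from the question, the processing step is only a placeholder, and it additionally needs the HasOffsetRanges, OffsetRange and CanCommitOffsets imports from org.apache.spark.streaming.kafka010:

stream.foreachRDD(rdd -> {
    // Grab the Kafka offset ranges backing this batch's RDD
    OffsetRange[] offsetRanges = ((HasOffsetRanges) rdd.rdd()).offsetRanges();

    // ... process the records of this batch here ...

    // After the output has completed, commit the offsets back to Kafka
    // asynchronously (they are stored in Kafka's __consumer_offsets topic)
    ((CanCommitOffsets) stream.inputDStream()).commitAsync(offsetRanges);
});

The WARN line itself is expected and harmless: fixKafkaParams deliberately sets auto.offset.reset to none on the executors so that they only read the exact offset ranges the driver assigns them; the latest value you pass still applies to the driver-side consumer that decides where each batch starts.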