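I have upgraded Kafka and the Spark Streaming Kafka integration, but I am having trouble adapting to some of the methods that changed. For example, KafkaUtils throws errors, and so does Iterator. My Kafka version is 0.10.1.1. It would be great if someone knows how to handle these changes. Thanks.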
Answer 0: (score: 0)
KafkaUtils is part of Apache Spark Streaming, not Apache Kafka:
org.apache.spark.streaming.kafka.KafkaUtils
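Answer 1: (score: 0)
The KafkaUtils package used to be "org.apache.spark.streaming.kafka". In the newer Kafka 0.10 integration it is "org.apache.spark.streaming.kafka010".
To set the Kafka parameters and topic details, see the following snippet: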
import java.util.*;
import org.apache.spark.SparkConf;
import org.apache.spark.TaskContext;
import org.apache.spark.api.java.*;
import org.apache.spark.api.java.function.*;
import org.apache.spark.streaming.api.java.*;
import org.apache.spark.streaming.kafka010.*;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;
import scala.Tuple2;
Map<String, Object> kafkaParams = new HashMap<>();
kafkaParams.put("bootstrap.servers", "localhost:9092,anotherhost:9092");
kafkaParams.put("key.deserializer", StringDeserializer.class);
kafkaParams.put("value.deserializer", StringDeserializer.class);
kafkaParams.put("group.id", "use_a_separate_group_id_for_each_stream");
kafkaParams.put("auto.offset.reset", "latest");
kafkaParams.put("enable.auto.commit", false);
Collection<String> topics = Arrays.asList("topicA", "topicB");
// streamingContext is assumed to be an existing JavaStreamingContext
final JavaInputDStream<ConsumerRecord<String, String>> stream =
  KafkaUtils.createDirectStream(
    streamingContext,
    LocationStrategies.PreferConsistent(),
    ConsumerStrategies.<String, String>Subscribe(topics, kafkaParams)
  );

// Convert each record into a (key, value) pair
stream.mapToPair(
  new PairFunction<ConsumerRecord<String, String>, String, String>() {
    @Override
    public Tuple2<String, String> call(ConsumerRecord<String, String> record) {
      return new Tuple2<>(record.key(), record.value());
    }
  });
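If several streams need their own parameter maps, the settings from the snippet can be gathered in a small helper. The following is only a sketch: the class name KafkaParamsSketch and the method buildParams are hypothetical and not part of Spark or Kafka. It assumes string keys and values, and it passes the deserializers as fully qualified class-name strings, which the Kafka consumer configuration accepts in place of Class objects, so the sketch needs nothing beyond java.util to compile.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Spark or Kafka.
class KafkaParamsSketch {

    // Kafka's string deserializer, referenced by name instead of by Class object
    private static final String STRING_DESERIALIZER =
        "org.apache.kafka.common.serialization.StringDeserializer";

    // Builds the same consumer settings as the snippet above,
    // with the broker list and group id supplied by the caller.
    static Map<String, Object> buildParams(String bootstrapServers, String groupId) {
        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", bootstrapServers);
        kafkaParams.put("key.deserializer", STRING_DESERIALIZER);
        kafkaParams.put("value.deserializer", STRING_DESERIALIZER);
        kafkaParams.put("group.id", groupId);
        kafkaParams.put("auto.offset.reset", "latest");
        kafkaParams.put("enable.auto.commit", false);
        return kafkaParams;
    }

    public static void main(String[] args) {
        // Example values only; replace with your own brokers and group id
        Map<String, Object> params = buildParams("localhost:9092", "example_group");
        params.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

The returned map can be passed as the second argument to ConsumerStrategies.Subscribe, as in the snippet above, with a different group id for each stream.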
For further reference, see https://spark.apache.org/docs/latest/streaming-kafka-0-10-integration.html