How to read binary data from a Kafka topic in Spark

Time: 2018-04-26 01:55:45

Tags: java apache-spark apache-kafka spark-streaming

I need to read encrypted messages from a Kafka topic. My current code for reading strings from the topic looks like this:

JavaPairReceiverInputDStream<String, String> pairrdd = 
            KafkaUtils.createStream(jssc, zkQuorum, group, topicmap);

How do I change this code so that it reads byte arrays from the Kafka queue, ensuring the encrypted data is not corrupted?

1 Answer:

Answer 0: (score: 0)

To read data from Kafka as <byte[], byte[]>, you can use KafkaUtils with ByteArrayDeserializer, like this -

Map<String, Object> kafkaParams = new HashMap<>();
kafkaParams.put("bootstrap.servers", "localhost:9092,anotherhost:9092");
kafkaParams.put("key.deserializer", ByteArrayDeserializer.class);
kafkaParams.put("value.deserializer", ByteArrayDeserializer.class);
kafkaParams.put("group.id", "use_a_separate_group_id_for_each_stream");
kafkaParams.put("auto.offset.reset", "latest");
kafkaParams.put("enable.auto.commit", false);

Collection<String> topics = Arrays.asList("topicA", "topicB");

JavaInputDStream<ConsumerRecord<byte[], byte[]>> stream =
  KafkaUtils.createDirectStream(
    jssc,
    LocationStrategies.PreferConsistent(),
    ConsumerStrategies.<byte[], byte[]>Subscribe(topics, kafkaParams)
  );
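To see why ByteArrayDeserializer matters here, consider what a String-based consumer does to ciphertext. Below is a minimal plain-Java sketch (the byte values are made up for illustration) showing that decoding arbitrary binary data to a String and back is lossy, which is exactly the corruption the question asks about:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BinaryRoundTrip {
    public static void main(String[] args) {
        // Hypothetical encrypted payload: arbitrary bytes, some of which
        // are not valid UTF-8 sequences.
        byte[] encrypted = new byte[] { (byte) 0xC3, (byte) 0x28, (byte) 0xFF, 0x00, 0x7F };

        // What a StringDeserializer effectively does: decode the bytes to
        // a String. Invalid sequences are replaced with U+FFFD, so getting
        // the bytes back out no longer yields the original ciphertext.
        String asString = new String(encrypted, StandardCharsets.UTF_8);
        byte[] roundTripped = asString.getBytes(StandardCharsets.UTF_8);

        System.out.println(Arrays.equals(encrypted, roundTripped)); // prints "false"
    }
}
```

With ByteArrayDeserializer, the consumer hands you the record value verbatim, so no such round trip happens.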

I hope this helps!
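Once the stream delivers raw byte[] values, you still need to decrypt them (e.g., inside a map over the stream, calling record.value()). A minimal standalone sketch of that step, assuming the producer used plain AES with a shared key (the cipher choice and key handling here are placeholders, not anything the question specifies; substitute whatever scheme your producer actually uses):

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import java.nio.charset.StandardCharsets;

public class DecryptValue {
    // Decrypts one record value. "AES" (ECB/PKCS5Padding by default) is a
    // placeholder cipher -- use the mode and key your producer uses.
    static byte[] decrypt(byte[] value, SecretKey key) throws Exception {
        Cipher cipher = Cipher.getInstance("AES");
        cipher.init(Cipher.DECRYPT_MODE, key);
        return cipher.doFinal(value);
    }

    public static void main(String[] args) throws Exception {
        SecretKey key = KeyGenerator.getInstance("AES").generateKey();

        // Simulate a producer encrypting a message...
        Cipher enc = Cipher.getInstance("AES");
        enc.init(Cipher.ENCRYPT_MODE, key);
        byte[] payload = enc.doFinal("secret message".getBytes(StandardCharsets.UTF_8));

        // ...and the consumer decrypting the raw byte[] value it received.
        byte[] plain = decrypt(payload, key);
        System.out.println(new String(plain, StandardCharsets.UTF_8)); // prints "secret message"
    }
}
```

The important point is that decrypt operates on the untouched byte[] from ConsumerRecord.value(); with a String-typed stream the ciphertext would already be damaged before it reached the cipher.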