Why does writing to a Kafka topic from a Spark RDD not work?

Time: 2016-06-17 11:55:50

Tags: apache-spark apache-kafka

Why aren't the messages from the RDD being written to the Kafka topic? I don't know what I'm doing wrong. Basically I want to receive a message, do some filtering on it, and write it to another Kafka topic. How should I initialize the Kafka producer when working with a Spark RDD?

PS: I want to do this with Kafka's own API, not with another API such as Cloudera's Spark Kafka Writer.

JavaPairInputDStream<byte[], byte[]> directKafkaStream = KafkaUtils.createDirectStream(ssc, byte[].class,
            byte[].class, DefaultDecoder.class, DefaultDecoder.class, kafkaParams, topics);
    directKafkaStream.foreachRDD(rdd -> {
        rdd.foreach(record -> {
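            // A new producer and Avro reader are built for every single record (and the producer is closed again below)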
            KafkaProducer<String, String> producer = new KafkaProducer<String, String>(kafkaProps);
            Schema.Parser parser = new Schema.Parser();
            Schema schema = parser.parse(USER_SCHEMA);
            DatumReader<GenericRecord> reader = new SpecificDatumReader<GenericRecord>(schema);
            Decoder decoderRec1 = DecoderFactory.get().binaryDecoder(record._1, null);
            Decoder decoderRec2 = DecoderFactory.get().binaryDecoder(record._2, null);
            GenericRecord messageValue = null;
            GenericRecord messageKey = null;
            try {
                messageValue = reader.read(null, decoderRec2);
                messageKey = reader.read(null, decoderRec1);
                ProducerRecord<String, String> record1 = new ProducerRecord<>("outputTopic", "myKey",
                        "Key: " + messageKey + " Value" + messageValue.toString());
                Future<RecordMetadata> sent = producer.send(record1);
                sent.get();
            } catch (Exception e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
            System.out.println("Received: " + messageValue + " and key: " + messageKey);
            producer.close();
        });
    });
    ssc.start();
    ssc.awaitTermination();

Here is the working example:

JavaDStream<InputMessage> inputMessageStream = directKafkaStream.map(avroRecord -> InputMessageTranslator.decodeAvro(avroRecord._2));
    inputMessageStream.foreachRDD(rdd -> {
        rdd.foreach(message -> {
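            // Again one producer per record; a partition-level variant is sketched below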
            KafkaProducer<String, String> producer = new KafkaProducer<String, String>(kafkaProps);
            ProducerRecord<String, String> record = new ProducerRecord<>(startProps.getProperty(InputPropertyKey.OUTPUT_TOPIC.toString()),
                    "myKey", "" + message.getSessionStartTime());
            producer.send(record);
            producer.close();
        });
    });
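
A variant that is often suggested for this kind of job is to create the producer once per partition with foreachPartition rather than once per record, so the producer is not built and torn down for every message. This is only a minimal sketch of that pattern, reusing kafkaProps, startProps, InputPropertyKey and the InputMessage type already shown in the snippets above:

    inputMessageStream.foreachRDD(rdd -> {
        rdd.foreachPartition(messages -> {
            // One producer per Spark partition instead of one per record
            KafkaProducer<String, String> producer = new KafkaProducer<String, String>(kafkaProps);
            while (messages.hasNext()) {
                InputMessage message = messages.next();
                ProducerRecord<String, String> record = new ProducerRecord<>(
                        startProps.getProperty(InputPropertyKey.OUTPUT_TOPIC.toString()),
                        "myKey", "" + message.getSessionStartTime());
                producer.send(record);
            }
            // close() blocks until previously sent records have completed
            producer.close();
        });
    });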

0 Answers:

No answers yet.