I am streaming messages from Message Hub to a Spark instance in Bluemix. I am using a Java client to add a simple JSON message to Message Hub.
JSON message -
{"country":"Netherlands","dma_code":"0","timezone":"Europe\/Amsterdam","area_code":"0","ip":"46.19.37.108","asn":"AS196752","continent_code":"EU","isp":"Tilaa V.O.F.","longitude":5.75,"latitude":52.5,"country_code":"NL","country_code3":"NLD"}
When I start streaming in Spark, the message I receive has an extra null at the beginning.
(null,{"country":"Netherlands","dma_code":"0","timezone":"Europe\/Amsterdam","area_code":"0","ip":"46.19.37.108","asn":"AS196752","continent_code":"EU","isp":"Tilaa V.O.F.","longitude":5.75,"latitude":52.5,"country_code":"NL","country_code3":"NLD"})
Please let me know why the Spark context puts this null in front. How can I remove it?
KafkaSender code -
KafkaProducer<String, String> kafkaProducer;
kafkaProducer = new KafkaProducer<String, String>(props);
ProducerRecord<String, String> producerRecord = new ProducerRecord<String, String>(topic,message);
RecordMetadata recordMetadata = kafkaProducer.send(producerRecord).get();
// RecordMetadata can be used to verify the topic, partition and offset
System.out.println("topic where message is published : " + recordMetadata.topic());
System.out.println("partition where message is published : " + recordMetadata.partition());
System.out.println("message offset # : " + recordMetadata.offset());
kafkaProducer.close();
Thanks,
Raj
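Answer 0 (score: 0)
Your key is null. Each record arrives as a (key, value) pair: the first element is the key and the second is the value.
I suggest you post the code that publishes the message to Kafka / Message Hub to get a better answer.
To solve your problem: if your goal is just to print the messages, you can do something like the following, which prints the data to stdout and ignores the null key.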
stream.foreachRDD(recordRDD => {
recordRDD.foreach(record => print(record._2))
})
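
If the value is needed for more than printing, the same idea applies: keep only the second element of each pair. Below is a minimal sketch in plain Java, with no Kafka or Spark dependencies. It assumes `Map.Entry` as a stand-in for the (key, value) tuple the stream delivers; the class name `NullKeyDemo` and the method `valuesOnly` are hypothetical and not part of any library.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class NullKeyDemo {

    // Hypothetical helper: drops the key of every (key, value) pair
    // and returns only the values, in their original order.
    static List<String> valuesOnly(List<Map.Entry<String, String>> records) {
        return records.stream()
                .map(Map.Entry::getValue)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Stand-in records: the key is null because none was set when publishing.
        List<Map.Entry<String, String>> records = Arrays.asList(
                new SimpleEntry<String, String>(null, "{\"country\":\"Netherlands\"}"),
                new SimpleEntry<String, String>(null, "{\"country_code\":\"NL\"}"));

        // Print only the JSON payloads, without the leading null key.
        for (String json : valuesOnly(records)) {
            System.out.println(json);
        }
    }
}
```

In the Spark job itself, the equivalent step would be mapping each record to `record._2`, as in the answer's snippet, before any further processing.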