Hi, I'm trying to integrate Kafka with Spark Streaming.
I want to get the message count inside foreachRDD on a JavaDStream.
My code is below; please take a look and give me some advice.
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import kafka.serializer.StringDecoder;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaPairInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;
import scala.Tuple2;

public class App {
    @SuppressWarnings("serial")
    public static void main(String[] args) throws Exception {
        SparkConf conf = new SparkConf()
                .setAppName("Streamingkafka")
                .setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);
        JavaStreamingContext ssc = new JavaStreamingContext(sc, new Duration(1000));

        Map<String, String> kafkaParams = new HashMap<String, String>();
        kafkaParams.put("metadata.broker.list", "localhost:9092");
        Set<String> topics = Collections.singleton("data_one");

        JavaPairInputDStream<String, String> directKafkaStream = KafkaUtils.createDirectStream(
                ssc, String.class, String.class, StringDecoder.class, StringDecoder.class,
                kafkaParams, topics);

        // Extract the message value from each (key, value) pair.
        JavaDStream<String> msgDataStream = directKafkaStream.map(new Function<Tuple2<String, String>, String>() {
            @Override
            public String call(Tuple2<String, String> tuple2) {
                return tuple2._2();
            }
        });

        msgDataStream.print();
        msgDataStream.count();

        ssc.start();
        ssc.awaitTermination();
    }
}
Thanks in advance.
Answer 0 (score: 1)
In your code, msgDataStream.count() is just a transformation: it returns a new DStream of per-batch counts, and since no output operation is registered on that stream, Spark never computes or prints anything for it. To print the number of messages in each batch, call count() on the RDD inside foreachRDD:

JavaDStream<String> msgDataStream = directKafkaStream.map(new Function<Tuple2<String, String>, String>() {
    @Override
    public String call(Tuple2<String, String> tuple2) {
        return tuple2._2();
    }
});
msgDataStream.foreachRDD(x -> System.out.println(x.count()));
ssc.start();
ssc.awaitTermination();
I'm writing the foreachRDD as a Java 8 lambda here. If you are on an earlier version of Java, use the anonymous-class form of foreachRDD below.
msgDataStream.foreachRDD(new VoidFunction<JavaRDD<String>>() {
    @Override
    public void call(JavaRDD<String> rdd) throws Exception {
        System.out.println(rdd.count());
    }
});
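If you also need a running total across batches rather than just the per-batch count, here is a minimal sketch (the totalCount variable name is illustrative, not from your code, and you need to import java.util.concurrent.atomic.AtomicLong): since rdd.count() is an action whose long result is returned to the driver, you can simply accumulate it in a driver-side AtomicLong.

final AtomicLong totalCount = new AtomicLong(0); // illustrative name; lives in the driver JVM

msgDataStream.foreachRDD(rdd -> {
    // count() is an action: the counting happens on the executors,
    // but the resulting long comes back to the driver thread running this function
    long batchCount = rdd.count();
    System.out.println("batch: " + batchCount
            + ", total so far: " + totalCount.addAndGet(batchCount));
});

Note this works because the function passed to foreachRDD runs on the driver; code inside rdd.map or rdd.foreach runs on the executors, where a plain JVM variable like this would not be shared.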