I understand how to use mapWithState from the JavaStatefulNetworkWordCount example. However, I have a question. Suppose I have this JSON line:
{"device":"dv1","parameter1":"vv1","parameter2":"vv2"}
Using JavaStatefulNetworkWordCount plus my own code to parse the JSON, I can count how many times dv1 has appeared. The printed result is, for example:
(dv1, 51).
Now I would like to include that count in the JSON line itself, to get this output:
{"device":"dv1","parameter1":"vv1","parameter2":"vv2","increment":51}.
Do you have any idea how to achieve this? I don't see how to get there from my current code.
My code so far is:
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

import scala.Tuple2;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.Optional;
import org.apache.spark.api.java.StorageLevels;
import org.apache.spark.api.java.function.Function3;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.State;
import org.apache.spark.streaming.StateSpec;
import org.apache.spark.streaming.api.java.*;

/**
 * Counts words
 * To run this on your local machine, you need to first run a Netcat server
 * `$ nc -lk 9999`
 * and then run the example
 * `$ bin/run-example
 * org.apache.spark.examples.streaming.JavaStatefulNetworkWordCount localhost 9999`
 */
public class JavaStatefulNetworkWordCount {
    private static final Pattern SPACE = Pattern.compile(" ");

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("Usage: JavaStatefulNetworkWordCount <hostname> <port>");
            System.exit(1);
        }

        // Create the context with a 2-second batch interval
        SparkConf sparkConf = new SparkConf().setAppName("test1").setMaster("local[*]");
        JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, Durations.seconds(2));
        ssc.checkpoint(".");

        // Initial state RDD input to mapWithState (empty: no device has been seen yet)
        List<Tuple2<String, Integer>> tuples =
            Arrays.asList();
        JavaPairRDD<String, Integer> initialRDD = ssc.sparkContext().parallelizePairs(tuples);

        JavaReceiverInputDStream<String> lines = ssc.socketTextStream(
            args[0], Integer.parseInt(args[1]), StorageLevels.MEMORY_AND_DISK_SER_2);

        // Map each JSON line to its device name; the rest of the line is dropped here
        JavaDStream<String> words = lines.map(x -> {
            String deviceName = "";
            // extract the device name (for instance dv1) from x
            return deviceName;
        });

        JavaPairDStream<String, Integer> wordsDstream = words.mapToPair(
            s -> new Tuple2<>(s, 1));

        // Update the cumulative count function
        Function3<String, Optional<Integer>, State<Integer>, Tuple2<String, Integer>> mappingFunc =
            (word, one, state) -> {
                int sum = one.orElse(0) + (state.exists() ? state.get() : 0);
                Tuple2<String, Integer> output = new Tuple2<>(word, sum);
                state.update(sum);
                return output;
            };

        // DStream of cumulative counts that get updated in every batch
        JavaMapWithStateDStream<String, Integer, Integer, Tuple2<String, Integer>> stateDstream =
            wordsDstream.mapWithState(StateSpec.function(mappingFunc).initialState(initialRDD));

        stateDstream.print();
        ssc.start();
        ssc.awaitTermination();
    }
}
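To make the desired output concrete, here is a minimal sketch in plain Java, without Spark, of the per-record behavior I am after. Everything in it is hypothetical: the class `DeviceIncrementSketch`, its methods, and the `HashMap` that stands in for Spark's per-key `State<Integer>` are illustration only. It also assumes a flat, non-empty JSON object with a string-valued "device" field and patches the string directly; a real job would more likely use a proper JSON library.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Spark.
class DeviceIncrementSketch {
    // Assumption: the line is a flat JSON object with a string-valued "device" field.
    private static final Pattern DEVICE =
        Pattern.compile("\"device\"\\s*:\\s*\"([^\"]*)\"");

    // Stand-in for the per-key State<Integer> that mapWithState maintains.
    private final Map<String, Integer> counts = new HashMap<>();

    // Returns the device name, or an empty string if the field is missing.
    static String extractDevice(String json) {
        Matcher m = DEVICE.matcher(json);
        return m.find() ? m.group(1) : "";
    }

    // Inserts "increment":<count> just before the closing brace of the object.
    static String withIncrement(String json, int count) {
        String trimmed = json.trim();
        int end = trimmed.lastIndexOf('}');
        if (end < 0) {
            throw new IllegalArgumentException("not a JSON object: " + json);
        }
        return trimmed.substring(0, end) + ",\"increment\":" + count + "}";
    }

    // Same role as the mapping function: bump the device's count, emit the enriched line.
    String process(String json) {
        String device = extractDevice(json);
        int sum = counts.merge(device, 1, Integer::sum);
        return withIncrement(json, sum);
    }

    public static void main(String[] args) {
        DeviceIncrementSketch sketch = new DeviceIncrementSketch();
        String line = "{\"device\":\"dv1\",\"parameter1\":\"vv1\",\"parameter2\":\"vv2\"}";
        System.out.println(sketch.process(line)); // ends with ,"increment":1}
        System.out.println(sketch.process(line)); // ends with ,"increment":2}
    }
}
```

In the streaming job, I assume this would mean keying the stream by device with the full JSON line as the value (rather than the constant 1), and having the mapping function return the enriched String instead of a `Tuple2<String, Integer>`, but I have not verified that.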
Thanks in advance,
J