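I have some DStream data coming from a Kafka stream, and I want to reduce it to the required format using Scala. Please give me some hints on how to get the right format for the given DStream data.
For example: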
case class Test(cust: String, labNo: String, device: String, properties: String, maxValue: Int, count: Int)
DStream[((cust_1,lab2, switch,cpu,90,2),
(cust_1,lab2, switch,cpu,80,4),
(cust_1,lab2, ap,cpu,70,4),
(cust_1,lab2, switch,mem,70,3),
(cust_1,lab1, switch,cpu,90,2),
(cust_1,lab1, switch,cpu,70,4),
(cust_1,lab1, switch,cpu,80,4),
(cust_1,lab1, ap,mem,70,3))]
Format the data as (k, v) and apply groupByKey on (cust, lab) =>
to get the following expected result:
(cust_1, lab2), ArrayBuffer((switch, cpu, 90, 2), (switch,cpu,80,4), (ap,cpu,70,4), (switch,mem,70,3))
(cust_1, lab1), ArrayBuffer((switch, cpu, 90, 2), (switch,cpu,70,4), (switch,cpu,80,4), (ap,mem,70,3))
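A minimal sketch of this first step, using plain Scala collections as a stand-in for the contents of one micro-batch so that it runs without a Spark cluster. The object name GroupByCustLab and the value names are hypothetical. On an actual DStream[Test], the same shape should be obtainable with stream.map(...) followed by groupByKey() from the pair-DStream operations, which yields an Iterable of values per key.

```scala
case class Test(cust: String, labNo: String, device: String,
                properties: String, maxValue: Int, count: Int)

// Hypothetical holder object; the Seq plays the role of one micro-batch.
object GroupByCustLab {
  val batch: Seq[Test] = Seq(
    Test("cust_1", "lab2", "switch", "cpu", 90, 2),
    Test("cust_1", "lab2", "switch", "cpu", 80, 4),
    Test("cust_1", "lab2", "ap", "cpu", 70, 4),
    Test("cust_1", "lab2", "switch", "mem", 70, 3),
    Test("cust_1", "lab1", "switch", "cpu", 90, 2),
    Test("cust_1", "lab1", "switch", "cpu", 70, 4),
    Test("cust_1", "lab1", "switch", "cpu", 80, 4),
    Test("cust_1", "lab1", "ap", "mem", 70, 3)
  )

  // Key each record by (cust, labNo); the remaining fields become the value.
  val keyed: Seq[((String, String), (String, String, Int, Int))] =
    batch.map(t => ((t.cust, t.labNo), (t.device, t.properties, t.maxValue, t.count)))

  // Collect all values that share a key (the collection analogue of groupByKey).
  val grouped: Map[(String, String), Seq[(String, String, Int, Int)]] =
    keyed.groupBy(_._1).map { case (key, pairs) => key -> pairs.map(_._2) }

  def main(args: Array[String]): Unit = grouped.foreach(println)
}
```

Note that groupByKey on a stream shuffles every value for a key to one place, so for large batches an aggregating operation such as reduceByKey may be preferable if only summaries are needed.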
Then groupByKey(device) within each ArrayBuffer:
(cust_1, lab2), ArrayBuffer((switch -> ((cpu, 90, 2), (cpu,80,4), (mem,70,3)), ap -> (cpu,70,4)))
(cust_1, lab1), ArrayBuffer((switch -> ((cpu, 90, 2), (cpu,70,4), (cpu,80,4)), ap -> (mem,70,3)))
Then groupByKey(property) for each device:
(cust_1, lab2), ArrayBuffer((switch -> (cpu -> ((90, 2), (80,4)), mem -> (70,3)), ap -> (cpu -> (70,4))))
(cust_1, lab1), ArrayBuffer((switch -> (cpu -> ((90, 2), (70,4), (80,4))), ap -> (mem -> (70,3))))
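The two nested grouping steps could be sketched as a single pure function over the values of one (cust, lab) key. This is only an illustration with plain Scala collections; the names NestedGrouping, Row, and nest are hypothetical. On a grouped pair DStream, such a function could presumably be applied with mapValues(vs => nest(vs.toSeq)).

```scala
// Hypothetical helper: device -> property -> list of (maxValue, count).
object NestedGrouping {
  // (device, property, maxValue, count)
  type Row = (String, String, Int, Int)

  def nest(rows: Seq[Row]): Map[String, Map[String, Seq[(Int, Int)]]] =
    // First level: group rows by device.
    rows.groupBy(_._1).map { case (device, byDevice) =>
      // Second level: within a device, group by property and keep only the numbers.
      device -> byDevice.groupBy(_._2).map { case (property, byProperty) =>
        property -> byProperty.map(r => (r._3, r._4))
      }
    }

  // The rows of (cust_1, lab2) from the example input.
  val lab2: Seq[Row] = Seq(
    ("switch", "cpu", 90, 2),
    ("switch", "cpu", 80, 4),
    ("ap", "cpu", 70, 4),
    ("switch", "mem", 70, 3)
  )

  def main(args: Array[String]): Unit = println(nest(lab2))
}
```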
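The example above is only meant to explain the requirement. It has not been executed, so please ignore any Scala output formatting.
So the final output should look like this: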
result {
  cust: "xyz"
  labNo: 2
  device {
    switch {
      cpu {
        maxValue: 80
        count: 4
      }
      cpu {
        maxValue: 90
        count: 2
      }
      mem {
        maxValue: 70
        count: 3
      }
    }
    ap {
      cpu {
        maxValue: 70
        count: 4
      }
      mem {
        maxValue: 70
        count: 3
      }
    }
  }
}
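For illustration, a nested device -> property -> (maxValue, count) map could be rendered into a block of this shape with a small helper like the one below. The names RenderResult and render are hypothetical, devices and properties are sorted alphabetically only to make the output deterministic, and labNo is printed as the string it is given, since the case class declares it as a String.

```scala
// Hypothetical renderer for the nested structure; not tied to any library format.
object RenderResult {
  type Nested = Map[String, Map[String, Seq[(Int, Int)]]]

  def render(cust: String, labNo: String, devices: Nested): String = {
    val deviceLines = devices.toSeq.sortBy(_._1).flatMap { case (device, props) =>
      // One block per (maxValue, count) pair, so a property may repeat.
      val propLines = props.toSeq.sortBy(_._1).flatMap { case (prop, stats) =>
        stats.flatMap { case (maxValue, count) =>
          Seq(
            s"      $prop {",
            s"        maxValue: $maxValue",
            s"        count: $count",
            "      }"
          )
        }
      }
      (s"    $device {" +: propLines) :+ "    }"
    }
    val header = Seq("result {", "  cust: \"" + cust + "\"", s"  labNo: $labNo", "  device {")
    (header ++ deviceLines ++ Seq("  }", "}")).mkString("\n")
  }

  def main(args: Array[String]): Unit =
    println(render("cust_1", "lab2", Map("ap" -> Map("cpu" -> Seq((70, 4))))))
}
```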