How do I reduce DStream data from Kafka in Scala?

Asked: 2017-11-09 18:58:14

Tags: scala

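I have some DStream data coming from a Kafka stream, and I want to reduce it to a required format using Scala. Please give me some hints on how to get the given DStream data into the right format.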

For example:

case class Test(cust: String, labNo: String, device: String, properties: String, maxValue: Int, count: Int)

DStream[((cust_1,lab2, switch,cpu,90,2),
         (cust_1,lab2, switch,cpu,80,4),
         (cust_1,lab2, ap,cpu,70,4),
         (cust_1,lab2, switch,mem,70,3),
         (cust_1,lab1, switch,cpu,90,2),
         (cust_1,lab1, switch,cpu,70,4),
         (cust_1,lab1, switch,cpu,80,4),
         (cust_1,lab1, ap,mem,70,3))]

Format the records as (k, v) pairs and apply groupByKey on (cust, lab) =>

to get the following expected result:

(cust_1, lab2), ArrayBuffer((switch, cpu, 90, 2), (switch, cpu, 80, 4), (ap, cpu, 70, 4), (switch, mem, 70, 3))
(cust_1, lab1), ArrayBuffer((switch, cpu, 90, 2), (switch, cpu, 70, 4), (switch, cpu, 80, 4), (ap, mem, 70, 3))
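
A minimal sketch of this first step, using plain Scala collections in place of a live DStream so that it runs without Spark. The object name `GroupByCustLab` and the `records` value are hypothetical. On an actual `DStream[Test]`, the analogous step would presumably be `stream.map(t => ((t.cust, t.labNo), (t.device, t.properties, t.maxValue, t.count))).groupByKey()`, applied per batch.

```scala
case class Test(cust: String, labNo: String, device: String,
                properties: String, maxValue: Int, count: Int)

object GroupByCustLab {
  // Sample records copied from the question (stand-in for one DStream batch).
  val records: Seq[Test] = Seq(
    Test("cust_1", "lab2", "switch", "cpu", 90, 2),
    Test("cust_1", "lab2", "switch", "cpu", 80, 4),
    Test("cust_1", "lab2", "ap", "cpu", 70, 4),
    Test("cust_1", "lab2", "switch", "mem", 70, 3),
    Test("cust_1", "lab1", "switch", "cpu", 90, 2),
    Test("cust_1", "lab1", "switch", "cpu", 70, 4),
    Test("cust_1", "lab1", "switch", "cpu", 80, 4),
    Test("cust_1", "lab1", "ap", "mem", 70, 3)
  )

  // Key every record by (cust, labNo) and keep the remaining fields as the value.
  val grouped: Map[(String, String), Seq[(String, String, Int, Int)]] =
    records
      .groupBy(t => (t.cust, t.labNo))
      .map { case (key, ts) =>
        key -> ts.map(t => (t.device, t.properties, t.maxValue, t.count))
      }

  def main(args: Array[String]): Unit =
    grouped.foreach { case (key, values) => println(s"$key -> $values") }
}
```

Within each group, the values keep the order in which the records appeared in the input sequence.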

Then groupByKey(device) within each ArrayBuffer:

(cust_1, lab2), ArrayBuffer((switch -> ((cpu, 90, 2), (cpu, 80, 4), (mem, 70, 3)), ap -> (cpu, 70, 4)))
(cust_1, lab1), ArrayBuffer((switch -> ((cpu, 90, 2), (cpu, 70, 4), (cpu, 80, 4)), ap -> (mem, 70, 3)))

Then groupByKey(properties) for each device:

(cust_1, lab2), ArrayBuffer((switch -> (cpu -> ((90, 2), (80, 4)), mem -> (70, 3)), ap -> (cpu -> (70, 4))))
(cust_1, lab1), ArrayBuffer((switch -> (cpu -> ((90, 2), (70, 4), (80, 4))), ap -> (mem -> (70, 3))))
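
The three grouping levels above can be sketched as nested `groupBy` calls on plain Scala collections. This is only an illustration under assumptions: `NestedGrouping`, `nest`, `Stat`, and `sample` are hypothetical names, and on a real pair DStream the inner two levels would probably be applied to each group's values (for example via `mapValues` after `groupByKey`).

```scala
case class Test(cust: String, labNo: String, device: String,
                properties: String, maxValue: Int, count: Int)

object NestedGrouping {
  type Stat = (Int, Int) // (maxValue, count)

  // Sample records copied from the question.
  val sample: Seq[Test] = Seq(
    Test("cust_1", "lab2", "switch", "cpu", 90, 2),
    Test("cust_1", "lab2", "switch", "cpu", 80, 4),
    Test("cust_1", "lab2", "ap", "cpu", 70, 4),
    Test("cust_1", "lab2", "switch", "mem", 70, 3),
    Test("cust_1", "lab1", "switch", "cpu", 90, 2),
    Test("cust_1", "lab1", "switch", "cpu", 70, 4),
    Test("cust_1", "lab1", "switch", "cpu", 80, 4),
    Test("cust_1", "lab1", "ap", "mem", 70, 3)
  )

  // (cust, labNo) -> device -> property -> list of (maxValue, count)
  def nest(records: Seq[Test]): Map[(String, String), Map[String, Map[String, Seq[Stat]]]] =
    records.groupBy(t => (t.cust, t.labNo)).map { case (key, perLab) =>
      key -> perLab.groupBy(_.device).map { case (device, perDevice) =>
        device -> perDevice.groupBy(_.properties).map { case (prop, perProp) =>
          prop -> perProp.map(t => (t.maxValue, t.count))
        }
      }
    }

  def main(args: Array[String]): Unit =
    nest(sample).foreach { case (key, devices) => println(s"$key -> $devices") }
}
```

Using maps at every level (rather than an ArrayBuffer of pairs) makes lookups such as `nest(sample)(("cust_1", "lab2"))("switch")("cpu")` straightforward.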

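The examples above are only meant to explain the requirement. They were not actually executed, so please ignore any inaccuracies in the Scala output format.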

So the final output should look like this:

result {
   cust: "xyz"
   labNo: 2
   device {
      switch {
         cpu {
            maxValue: 80
            count: 4
         }
         cpu {
            maxValue: 90
            count: 2
         }
         mem {
            maxValue: 70
            count: 3
         }
      }
      ap {
         cpu {
            maxValue: 70
            count: 4
         }
         mem {
            maxValue: 70
            count: 3
         }
      }
   }
}
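
One possible way to print a single (cust, labNo) group in a layout like the one above is sketched here. It assumes the group has already been nested as device -> property -> list of (maxValue, count). `RenderResult` and `render` are hypothetical names, the indentation width is arbitrary, and devices and properties are sorted by name only to make the output deterministic.

```scala
object RenderResult {
  type Stat = (Int, Int) // (maxValue, count)

  // Renders one (cust, labNo) group as nested "name { ... }" blocks.
  def render(cust: String, labNo: String,
             devices: Map[String, Map[String, Seq[Stat]]]): String = {
    val deviceLines = devices.toSeq.sortBy(_._1).flatMap { case (device, props) =>
      val propLines = props.toSeq.sortBy(_._1).flatMap { case (prop, stats) =>
        // One block per (maxValue, count) pair, so a property may repeat.
        stats.flatMap { case (maxValue, count) =>
          Seq(
            s"      $prop {",
            s"        maxValue: $maxValue",
            s"        count: $count",
            "      }"
          )
        }
      }
      (s"    $device {" +: propLines) :+ "    }"
    }
    val header = Seq("result {", "  cust: \"" + cust + "\"", s"  labNo: $labNo", "  device {")
    (header ++ deviceLines ++ Seq("  }", "}")).mkString("\n")
  }

  def main(args: Array[String]): Unit = {
    val devices = Map(
      "switch" -> Map("cpu" -> Seq((90, 2), (80, 4)), "mem" -> Seq((70, 3))),
      "ap" -> Map("cpu" -> Seq((70, 4)))
    )
    println(render("cust_1", "lab2", devices))
  }
}
```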

0 answers:

No answers yet