Question

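I want to create a bar chart showing how many pixels of each color are in an image; the image updates every 3 seconds, so my bar chart will update as well.

I have a topic that collects JSON objects, where the key is the image creation date and the value is a hex value (e.g. #FFF).

I want to group by key, so that records are grouped by image, and then group each of those groups by hex value and do a .count().

How do you do that?

I was thinking of streams.groupByKey()... and then a groupBy on the hex value, but I would need to convert the KTable back to a KStream...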

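Update

Sorry for the lack of explanation; I was typing on my phone. Let me try again.

By the way, I have changed a few things. If you want to read what I am doing, here is my GitHub: https://github.com/Lilmortal.

My project "HexGraph-source-connector" picks up any images in a specified directory and pushes the image path to a topic.
The "HexGraph" project picks that up and, using Akka, its actors extract all the pixel hex codes and push them as individual messages to another topic.
"HexGraph-stream" is my Kafka Streams part.

But it's long, and I doubt you will read it.

Anyway, I read from a topic and receive messages like this: {imagePath: {hexCode: #fff}}. The image path is the key and hexCode is the value. I can have one to many imagePaths, so my idea is that my front end will have a websocket that picks them up. It will display an image with a bar chart on top showing the count of each pixel color code, e.g. 4 #fff, 28 #fef, etc.

So I want to group by imagePath, and then count each pixel color for that imagePath.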

For example:

{imagePath1: {hexCode: #fff, count: 47}}
{imagePath1: {hexCode: #fef, count: 61}}
{imagePath2: {hexCode: #fff, count: 23}}
{imagePath2: {hexCode: #fef, count: 55}}

So here imagePath1 has 47 #fff, while imagePath2 has 23 #fff.

That's what I am trying to do.

Answer 1

Maybe select a composite key before grouping? Something like this:

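StreamsBuilder topology = new StreamsBuilder();

// ImageHex is a value type with a public `hex` field (see Update 2 below).
KTable<String, Long> counts = topology
   .<String, ImageHex>stream("input")
   // Composite key: image path plus hex code. The "|" separator keeps the two parts unambiguous.
   .selectKey((k, v) -> k + "|" + v.hex)
   .groupByKey()
   .count();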

This does not group by twice, but it gets the desired effect.
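To make the composite-key idea concrete without a running Kafka cluster, here is a minimal plain-Java sketch of what the count ends up holding. It is only an illustration of the semantics, not Kafka Streams code: the class `CompositeKeyCountSketch`, the `Observation` type, and the "|" separator are assumptions introduced for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java illustration (no Kafka) of counting records by a composite "imagePath|hex" key. */
final class CompositeKeyCountSketch {

    /** Hypothetical stand-in for one record: key = imagePath, value = hex code. */
    static final class Observation {
        final String imagePath;
        final String hex;

        Observation(String imagePath, String hex) {
            this.imagePath = imagePath;
            this.hex = hex;
        }
    }

    /** Mimics selectKey + groupByKey + count: one counter per (imagePath, hex) pair. */
    static Map<String, Long> countByCompositeKey(List<Observation> observations) {
        Map<String, Long> counts = new TreeMap<>();
        for (Observation o : observations) {
            // The separator is an assumption; any character that cannot occur in either part works.
            String compositeKey = o.imagePath + "|" + o.hex;
            counts.merge(compositeKey, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Observation> input = Arrays.asList(
            new Observation("imagePath1", "#fff"),
            new Observation("imagePath1", "#fef"),
            new Observation("imagePath1", "#fff"),
            new Observation("imagePath2", "#fff"));
        countByCompositeKey(input)
            .forEach((key, count) -> System.out.println(key + " -> " + count));
    }
}
```

A consumer such as the websocket front end would have to split the key on the separator again to recover the image path and the hex code, which is the main trade-off of this approach.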

After comments

Update:

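import org.apache.kafka.common.serialization.Serde;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;

class Image {
    public String imagePath;
}

class ImageAggregation {
    public String imagePath;
    public int count;
}

class ImageSerde implements Serde<Image> {
    // implement
}

class ImageAggregationSerde implements Serde<ImageAggregation> {
    // implement
}

KTable<String, ImageAggregation> table = topology
  // The Long key serde is kept from the original snippet; use whichever serde matches your topic's key.
  .stream("input", Consumed.with(Serdes.Long(), new ImageSerde()))
  // Re-key by image path; Grouped (Kafka 2.1+) supplies the serdes for the repartition topic.
  .groupBy((k, v) -> v.imagePath, Grouped.with(Serdes.String(), new ImageSerde()))
  .aggregate(ImageAggregation::new,
             (k, v, agg) -> {
                 // Count one more record for this image path.
                 agg.imagePath = v.imagePath;
                 agg.count = agg.count + 1;
                 return agg;
             }, Materialized.with(Serdes.String(), new ImageAggregationSerde()));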
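For readers unfamiliar with how groupBy followed by aggregate behaves, the following plain-Java sketch mimics the semantics: the initializer runs once per new key, and the aggregator folds each record into the current value for its key. This is an illustration only and not the Kafka Streams API; `GroupAggregateSketch` and `groupAndAggregate` are hypothetical names made up for this example.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Function;
import java.util.function.Supplier;

/** Plain-Java illustration (no Kafka) of group-then-aggregate semantics. */
final class GroupAggregateSketch {

    static <V, K, A> Map<K, A> groupAndAggregate(
            List<V> records,
            Function<V, K> keySelector,
            Supplier<A> initializer,
            BiFunction<V, A, A> aggregator) {
        Map<K, A> table = new LinkedHashMap<>();
        for (V record : records) {
            K key = keySelector.apply(record);
            // First record for a key starts from the initializer's value.
            A current = table.containsKey(key) ? table.get(key) : initializer.get();
            table.put(key, aggregator.apply(record, current));
        }
        return table;
    }

    public static void main(String[] args) {
        List<String> imagePaths = Arrays.asList("imagePath1", "imagePath2", "imagePath1");
        // Counts records per image path, like the ImageAggregation example.
        Map<String, Integer> counts = groupAndAggregate(
            imagePaths, path -> path, () -> 0, (path, count) -> count + 1);
        System.out.println(counts);
    }
}
```

Unlike this in-memory loop, a real KTable emits an updated value downstream as records arrive, so a front end would see the counts grow over time rather than receive one final map.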

After the question was updated

Update 2:

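// Needs java.util.HashMap and java.util.Map, plus Serde and Serdes from
// org.apache.kafka.common.serialization and Consumed, Grouped, KTable,
// Materialized from org.apache.kafka.streams.kstream.

class ImageHex {
    public String imagePath;
    public String hex;
}

class ImageHexAggregation {
    public String imagePath;
    // Initialized here so the aggregator can update it without a null check.
    public Map<String, Integer> counts = new HashMap<>();
}

class ImageHexSerde implements Serde<ImageHex> {
    // implement
}

class ImageHexAggregationSerde implements Serde<ImageHexAggregation> {
    // implement
}

KTable<String, ImageHexAggregation> table = topology
  // The Long key serde is kept from the original snippet; use whichever serde matches your topic's key.
  .stream("image-hex-observations", Consumed.with(Serdes.Long(), new ImageHexSerde()))
  // Re-key by image path; Grouped (Kafka 2.1+) supplies the serdes for the repartition topic.
  .groupBy((k, v) -> v.imagePath, Grouped.with(Serdes.String(), new ImageHexSerde()))
  .aggregate(ImageHexAggregation::new,
             (k, v, agg) -> {
                 agg.imagePath = v.imagePath;
                 // Increment the count for this hex code, starting from 0 if it has not been seen yet.
                 Integer currentCount = agg.counts.getOrDefault(v.hex, 0);
                 agg.counts.put(v.hex, currentCount + 1);
                 return agg;
             }, Materialized.with(Serdes.String(), new ImageHexAggregationSerde()));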
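The aggregation above keeps one map of hex counts per image. As a rough sketch of how that nested result relates to the output format asked for in the question, the plain-Java example below builds the same imagePath -> (hex -> count) structure in memory and flattens it into one row per pair. It does not use Kafka; `HexCountsPerImageSketch`, `countHexPerImage`, and `toRows` are hypothetical names, and the row format simply copies the question's example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java illustration (no Kafka) of per-image hex counts held in a nested map. */
final class HexCountsPerImageSketch {

    /** Folds {imagePath, hex} pairs into imagePath -> (hex -> count). */
    static Map<String, Map<String, Integer>> countHexPerImage(List<String[]> observations) {
        Map<String, Map<String, Integer>> perImage = new TreeMap<>();
        for (String[] observation : observations) {
            String imagePath = observation[0];
            String hex = observation[1];
            perImage.computeIfAbsent(imagePath, p -> new TreeMap<>())
                    .merge(hex, 1, Integer::sum);
        }
        return perImage;
    }

    /** Flattens the nested map into one row per (imagePath, hex), in the question's format. */
    static List<String> toRows(Map<String, Map<String, Integer>> perImage) {
        List<String> rows = new ArrayList<>();
        perImage.forEach((imagePath, counts) ->
            counts.forEach((hex, count) ->
                rows.add("{" + imagePath + ": {hexCode: " + hex + ", count: " + count + "}}")));
        return rows;
    }

    public static void main(String[] args) {
        List<String[]> input = Arrays.asList(
            new String[] {"imagePath1", "#fff"},
            new String[] {"imagePath1", "#fef"},
            new String[] {"imagePath1", "#fff"},
            new String[] {"imagePath2", "#fff"});
        toRows(countHexPerImage(input)).forEach(System.out::println);
    }
}
```

Keeping all hex counts for an image in one value means a single update carries everything the bar chart needs, at the cost of a value that grows with the number of distinct colors in the image.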

Kafka Streams - how to group by twice?