Question

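(Apologies for the formatting; I'm posting from my phone.)

I'm doing a keyBy followed by an aggregation, but Flink isn't grouping the data correctly (instead, every event ends up in its own group).

Example: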

class Purchase {
    String product;
    Integer quantity;
}

class Filter {
    String product;

    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result + ((product == null) ? 0 : product.hashCode());
        return result;
    }
}

class FilteredPurchase {
    Filter filter;
    Purchase purchase;
}

DataStream<FilteredPurchase> result =
    ...
    .keyBy("filter")            // This works
    .keyBy(x -> x.getFilter())  // This doesn't (used in place of the line above)
    .sum("purchase.quantity");

Consider a stream containing the following events:

[
    {"filter": {"product": null}, "purchase": {"product": "apple", "quantity": 10}},
    {"filter": {"product": null}, "purchase": {"product": "apple", "quantity": 10}},
    {"filter": {"product": "apple"}, "purchase": {"product": "apple", "quantity": 10}},
    {"filter": {"product": "apple"}, "purchase": {"product": "apple", "quantity": 10}}
]

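I would expect this to be keyed into 2 partitions (since there are two distinct filters), each with a total of 20. In practice, however, I end up with 4 partitions, each with a total of 10.

Interestingly, if I use the field-expression version it does exactly what I want, but I would like to keep everything as POJOs because I plan to do more processing on them later.

Am I missing something here? Can a KeySelector return a POJO?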

Answer 1

My first question is why you don't just use the product (a String) as the key, since that is the only thing in the Filter class. So:

.keyBy(x -> x.getFilter().getProduct())

In any case, I think your key class (Filter) must also implement the equals() method, not just hashCode().
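To illustrate the equals() point, here is a minimal plain-Java sketch with no Flink dependency. It assumes that per-key aggregation looks keys up roughly the way a HashMap does (hashCode() first, then equals()); that is an assumption about how keyed state behaves, not something confirmed in the question. The class names HashOnlyFilter, ValueFilter and KeyEqualityDemo are hypothetical and only serve the illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Mirrors the Filter from the question: hashCode() is value-based,
// but equals() is inherited from Object (identity comparison).
final class HashOnlyFilter {
    final String product;

    HashOnlyFilter(String product) { this.product = product; }

    @Override
    public int hashCode() {
        return 31 + ((product == null) ? 0 : product.hashCode());
    }
}

// Same key, but with a value-based equals() consistent with hashCode().
final class ValueFilter {
    final String product;

    ValueFilter(String product) { this.product = product; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ValueFilter)) return false;
        return Objects.equals(product, ((ValueFilter) o).product);
    }

    @Override
    public int hashCode() {
        return 31 + ((product == null) ? 0 : product.hashCode());
    }
}

final class KeyEqualityDemo {
    // Adds `quantity` once per key occurrence, grouped by key
    // (a stand-in for keyBy(...).sum(...), assumed HashMap-like).
    static <K> Map<K, Integer> sumByKey(List<K> keys, int quantity) {
        Map<K, Integer> totals = new HashMap<>();
        for (K key : keys) {
            totals.merge(key, quantity, Integer::sum);
        }
        return totals;
    }

    // The four events from the question, keyed by a class without equals().
    static Map<HashOnlyFilter, Integer> hashOnlyTotals() {
        return sumByKey(Arrays.asList(
                new HashOnlyFilter(null), new HashOnlyFilter(null),
                new HashOnlyFilter("apple"), new HashOnlyFilter("apple")), 10);
    }

    // The same four events, keyed by a class with equals() and hashCode().
    static Map<ValueFilter, Integer> valueTotals() {
        return sumByKey(Arrays.asList(
                new ValueFilter(null), new ValueFilter(null),
                new ValueFilter("apple"), new ValueFilter("apple")), 10);
    }
}
```

In this sketch, hashOnlyTotals() yields four groups of 10 each, because no two instances are ever equal even though their hash codes match, while valueTotals() yields two groups of 20 each. That matches the symptom described in the question, which suggests the missing equals() is worth fixing first.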
