Question

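KeyedStream#max(String field)

Applies an aggregation that gives the current maximum of the data stream at the given field expression by the given key. An independent aggregate is kept per key. A field expression is either the name of a public field or a getter method with parentheses of the {@link DataStream}'s underlying type. A dot can be used to drill down into objects, as in {@code "field1.fieldxy"}.

KeyedStream#maxBy(String field)

Applies an aggregation that gives the current element with the maximum value at the given position by the given key. An independent aggregate is kept per key. If more than one element has the maximum value at the given position, the operator returns the first one by default.

The javadoc for these two APIs looks very similar. What is the difference between them, and when should I choose one over the other?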

Answer 1

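Until I studied the implementation in detail, I could not have told you the difference either.

Assume your POJO records have the schema (a: String, b: String, c: String).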

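max(String field)

keyedStream.max("a") returns, for each key, the first record, with its field "a" replaced by the maximum value of "a" seen for that key.

maxBy(String field)

keyedStream.maxBy("a") returns the record whose field "a" is the maximum (if several records tie, the first one is returned).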

For more details, you can check ComparableAggregator.java.
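To make the description above concrete, here is a minimal plain-Java sketch that imitates the two behaviours without any Flink dependency. The class `MaxVsMaxBySketch`, the record type `Rec`, and its fields `key`, `a`, and `b` are hypothetical and exist only for illustration; the code mirrors the semantics described in this answer and is not Flink's actual implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MaxVsMaxBySketch {

    /** Hypothetical record: a key, the aggregated field "a", and a payload field "b". */
    static final class Rec {
        final String key;
        final int a;
        final String b;

        Rec(String key, int a, String b) {
            this.key = key;
            this.a = a;
            this.b = b;
        }

        @Override
        public String toString() {
            return "(" + key + ", " + a + ", " + b + ")";
        }
    }

    /** Imitates max("a"): keep the first record per key, overwrite only field "a" with the running maximum. */
    static List<Rec> rollingMax(List<Rec> input) {
        Map<String, Rec> state = new HashMap<>();
        List<Rec> out = new ArrayList<>();
        for (Rec r : input) {
            Rec cur = state.get(r.key);
            Rec next = (cur == null) ? r : new Rec(cur.key, Math.max(cur.a, r.a), cur.b);
            state.put(r.key, next);
            out.add(next); // one output per input, like a rolling aggregate
        }
        return out;
    }

    /** Imitates maxBy("a"): keep the whole record that holds the maximum; on ties keep the earlier one. */
    static List<Rec> rollingMaxBy(List<Rec> input) {
        Map<String, Rec> state = new HashMap<>();
        List<Rec> out = new ArrayList<>();
        for (Rec r : input) {
            Rec cur = state.get(r.key);
            Rec next = (cur == null || r.a > cur.a) ? r : cur;
            state.put(r.key, next);
            out.add(next);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Rec> input = new ArrayList<>();
        input.add(new Rec("k", 1, "first"));
        input.add(new Rec("k", 5, "second"));
        input.add(new Rec("k", 3, "third"));

        System.out.println(rollingMax(input));
        System.out.println(rollingMaxBy(input));
    }
}
```

For this input, the max-style result is `[(k, 1, first), (k, 5, first), (k, 5, first)]`: the payload always comes from the first record. The maxBy-style result is `[(k, 1, first), (k, 5, second), (k, 5, second)]`: the payload follows the record that actually holds the maximum.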

Answer 2

The difference between max and maxBy is that max returns the maximum value, whereas maxBy returns the element that has the maximum value in this field.

 keyedStream.max(0);
 keyedStream.max("key");
 keyedStream.maxBy(0);
 keyedStream.maxBy("key");

The following examples also show the difference:

Using max:

            // Create a Tumbling Window with the values of 1 day:
            .timeWindow(Time.of(1, TimeUnit.DAYS))
            // Use the max Temperature of the day:
            .max("temperature")
            // And perform an Identity map, because we want to write all values of this day to the Database:
            .map(new MapFunction<elastic.model.LocalWeatherData, elastic.model.LocalWeatherData>() {
                @Override
                public elastic.model.LocalWeatherData map(elastic.model.LocalWeatherData localWeatherData) throws Exception {
                    return localWeatherData;
                }
            });

Using maxBy:

    // Now take the Maximum Temperature per day from the KeyedStream:
    DataStream<LocalWeatherData> maxTemperaturePerDay =
            localWeatherDataByStation
                    // Use non-overlapping tumbling window with 1 day length:
                    .timeWindow(Time.days(1))
                    // And use the maximum temperature:
                    .maxBy("temperature");
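What the two windowed snippets would yield for the non-aggregated fields can be sketched in plain Java, again without a Flink dependency. The `Reading` class below is a hypothetical stand-in for `LocalWeatherData` (only a station, a timestamp, and a temperature are assumed), and the one-day tumbling window is imitated by bucketing timestamps; treat it as an illustration of the semantics discussed in this thread, not as Flink code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DailyMaxSketch {

    /** Hypothetical stand-in for LocalWeatherData; only these three fields are assumed. */
    static final class Reading {
        final String station;
        final long timestampMillis;
        final double temperature;

        Reading(String station, long timestampMillis, double temperature) {
            this.station = station;
            this.timestampMillis = timestampMillis;
            this.temperature = temperature;
        }
    }

    static final long DAY_MILLIS = 24L * 60 * 60 * 1000;

    /** Station plus the start of the one-day tumbling window the reading falls into. */
    static String windowKey(Reading r) {
        long windowStart = r.timestampMillis - (r.timestampMillis % DAY_MILLIS);
        return r.station + "@" + windowStart;
    }

    /** max-style: first reading of the window, with only the temperature replaced by the maximum. */
    static Map<String, Reading> maxPerDay(List<Reading> readings) {
        Map<String, Reading> result = new LinkedHashMap<>();
        for (Reading r : readings) {
            result.merge(windowKey(r), r, (first, next) -> new Reading(
                    first.station, first.timestampMillis,
                    Math.max(first.temperature, next.temperature)));
        }
        return result;
    }

    /** maxBy-style: the complete reading at which the maximum temperature occurred. */
    static Map<String, Reading> maxByPerDay(List<Reading> readings) {
        Map<String, Reading> result = new LinkedHashMap<>();
        for (Reading r : readings) {
            result.merge(windowKey(r), r,
                    (best, next) -> next.temperature > best.temperature ? next : best);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Reading> readings = new ArrayList<>();
        readings.add(new Reading("S1", 0L, 10.0));
        readings.add(new Reading("S1", 3_600_000L, 25.0));
        readings.add(new Reading("S1", 7_200_000L, 15.0));

        Reading viaMax = maxPerDay(readings).get("S1@0");
        Reading viaMaxBy = maxByPerDay(readings).get("S1@0");
        System.out.println(viaMax.timestampMillis + " " + viaMax.temperature);
        System.out.println(viaMaxBy.timestampMillis + " " + viaMaxBy.temperature);
    }
}
```

Both variants report 25.0 as the daily maximum, but the max-style result keeps the timestamp of the first reading in the window (0), while the maxBy-style result keeps the timestamp of the reading that actually measured 25.0 (3600000). If the other fields of the record matter downstream, maxBy is usually the one you want.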

What is the difference between max and maxBy in KeyedStream?
