I am trying to write to an S3 sink.
private static StreamingFileSink<String> createS3SinkFromStaticConfig(
        final Map<String, Properties> applicationProperties
) {
    Properties sinkProperties = applicationProperties.get(SINK_PROPERTIES);
    String s3SinkPath = sinkProperties.getProperty(SINK_S3_PATH_KEY);
    return StreamingFileSink
            .forRowFormat(
                    new Path(s3SinkPath),
                    new SimpleStringEncoder<String>(StandardCharsets.UTF_8.toString())
            )
            .build();
}
The following code works, and I can see the results in S3:
input.map(value -> { // Parse the JSON
            JsonNode jsonNode = jsonParser.readValue(value, JsonNode.class);
            return new Tuple2<>(jsonNode.get("ticker").asText(), jsonNode.get("price").asDouble());
        }).returns(Types.TUPLE(Types.STRING, Types.DOUBLE))
        .keyBy(0) // Logically partition the stream per stock symbol
        .timeWindow(Time.seconds(10), Time.seconds(5)) // Sliding window definition
        .min(1) // Calculate minimum price per stock over the window
        .setParallelism(3) // Set parallelism for the min operator
        .map(value -> value.f0 + ": ----- " + value.f1.toString() + "\n")
        .addSink(createS3SinkFromStaticConfig(applicationProperties));
But the following does not write anything to S3:
KeyedStream<EnrichedMetric, EnrichedMetricKey> input = env.addSource(new EnrichedMetricSource())
        .assignTimestampsAndWatermarks(
                WatermarkStrategy.<EnrichedMetric>forMonotonousTimestamps()
                        .withTimestampAssigner(((event, l) -> event.getEventTime()))
        ).keyBy(new EnrichedMetricKeySelector());
DataStream<String> statsStream = input
        .window(TumblingEventTimeWindows.of(Time.seconds(5)))
        .process(new PValueStatisticsWindowFunction());
statsStream.addSink(createS3SinkFromStaticConfig(applicationProperties));
PValueStatisticsWindowFunction is a ProcessWindowFunction, as shown below.
@Override
public void process(EnrichedMetricKey enrichedMetricKey,
                    Context context,
                    Iterable<EnrichedMetric> in,
                    Collector<String> out) throws Exception {
    int count = 0;
    for (EnrichedMetric m : in) {
        count++;
    }
    out.collect("Count: " + count);
}
When I run the Flink application locally, statsStream.print() prints the results to log/flink-*-taskexecutor-*.out.
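On the cluster, I can see in the Flink dashboard that checkpointing is enabled, along with the checkpoint history. I have also made sure that the S3 path has the form s3a://<bucket>.
I am not sure what I am missing here.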
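
For reference, here is a minimal plain-Java sketch (no Flink dependency) of how a 5-second tumbling event-time window like the one in the second pipeline is assigned, and when it becomes eligible to fire. It assumes zero window offset, timestamps in epoch milliseconds, and a monotonous-timestamps watermark that trails the highest seen timestamp by 1 ms. The class and method names are hypothetical and only mirror what Flink does internally as far as I understand it; they are not Flink APIs.

```java
public class TumblingWindowSketch {

    // Start of the tumbling window containing timestampMs
    // (assumes zero offset and a non-negative timestamp).
    static long windowStart(long timestampMs, long sizeMs) {
        return timestampMs - (timestampMs % sizeMs);
    }

    // Last timestamp that still belongs to the window (window end minus 1 ms).
    static long windowMaxTimestamp(long timestampMs, long sizeMs) {
        return windowStart(timestampMs, sizeMs) + sizeMs - 1;
    }

    // Assumption: with monotonous timestamps, the watermark is the
    // highest timestamp seen so far minus 1 ms.
    static long watermarkAfter(long maxSeenTimestampMs) {
        return maxSeenTimestampMs - 1;
    }

    // An event-time window can fire once the watermark has reached
    // the window's max timestamp.
    static boolean canFire(long eventTimestampMs, long sizeMs, long watermarkMs) {
        return watermarkMs >= windowMaxTimestamp(eventTimestampMs, sizeMs);
    }

    public static void main(String[] args) {
        long sizeMs = 5_000L;   // Time.seconds(5)
        long eventTs = 12_000L; // hypothetical event timestamp in ms

        System.out.println("window start: " + windowStart(eventTs, sizeMs));
        System.out.println("window max timestamp: " + windowMaxTimestamp(eventTs, sizeMs));

        // Later events push the watermark forward; the window holding
        // eventTs can only fire once a late-enough event has been seen.
        long[] laterEvents = {13_000L, 14_999L, 15_000L};
        for (long ts : laterEvents) {
            long wm = watermarkAfter(ts);
            System.out.println("after event " + ts + ": watermark=" + wm
                    + ", canFire=" + canFire(eventTs, sizeMs, wm));
        }
    }
}
```

Under these assumptions, the window holding an event at 12,000 ms cannot emit anything until some event with a timestamp of at least 15,000 ms has been seen. If getEventTime() returned seconds rather than milliseconds, or if the watermark stopped advancing on the cluster, the window would stay open and the sink would receive nothing; the post does not say whether either applies here.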