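Basically, what I want to do is write events to GCS buckets chosen by tenantID (which is part of each event), using FileIO.writeDynamic with dynamic file naming in a Google Dataflow job.
The problem I am facing is this: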
srcEvents.apply("Window", Window
    .<MyEvent>into(FixedWindows.of(Duration.standardSeconds(60))))
    .apply("WriteAvro", FileIO.<MyEventDestination, MyEvent>writeDynamic()
        .by(groupFn).via(outputFn, sinkFn)
        .to() // what do I pass here? I want it to be based on event.getTenantId (e.g. gs://test-123)
        .withDestinationCoder(destinationCoder)
        .withNumShards(100).withNaming(namingFn));
I invoke this by applying a PTransform to srcEvents.
Answer 0 (score: 0)
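I was able to solve this with the withTempDirectory option instead of .to(): I give it a temporary GCS bucket path, and use the file naming function to build the dynamic bucket path for each domain.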
srcEvents.apply("Window", Window
    .<MyEvent>into(FixedWindows.of(Duration.standardSeconds(60))))
    .apply("WriteAvro", FileIO.<MyEventDestination, MyEvent>writeDynamic()
        .by(groupFn).via(outputFn, sinkFn)
        .withTempDirectory("gs://temp-blah/")
        .withDestinationCoder(destinationCoder)
        .withNumShards(100).withNaming(namingFn));
Here namingFn builds the full filename, e.g. gs://domain-123/2020-05-01/event.avro
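For illustration, here is a rough sketch of the path-building logic that a namingFn like the one above could delegate to. It is plain Java with no Beam dependency, so the wiring into FileIO.Write.FileNaming is left out. The class name TenantPaths, both method names, and the gs://domain-<tenantId> bucket scheme are my assumptions, modeled on the example path above. I also added the window start and shard index to the filename, on the assumption that 100 shards across many 60-second windows would otherwise collide on a single event.avro name.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper: not part of Beam, names chosen for illustration only.
final class TenantPaths {
    // Day folder such as 2020-05-01, always computed in UTC.
    private static final DateTimeFormatter DAY =
        DateTimeFormatter.ofPattern("yyyy-MM-dd").withZone(ZoneOffset.UTC);

    private TenantPaths() {}

    // Assumed bucket scheme: one bucket per tenant, e.g. gs://domain-123.
    static String bucketFor(String tenantId) {
        return "gs://domain-" + tenantId;
    }

    // Builds a full gs:// path; the window start (epoch millis) and shard
    // index keep files from different windows and shards distinct.
    static String filenameFor(String tenantId, Instant windowStart,
                              int shardIndex, int numShards) {
        return String.format(Locale.ROOT, "%s/%s/event-%d-%05d-of-%05d.avro",
            bucketFor(tenantId), DAY.format(windowStart),
            windowStart.toEpochMilli(), shardIndex, numShards);
    }
}
```

Inside the pipeline, the FileNaming returned by namingFn would presumably call something like TenantPaths.filenameFor(destination tenant ID, window start, shardIndex, numShards) from its getFilename method. The target buckets have to exist already and be writable by the Dataflow worker service account, since this only builds path strings.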