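I have been looking into how to refresh a side input in a Dataflow job without restarting the pipeline. I found this example, https://github.com/spotify/scio/issues/865, but I can't get it to work either locally or when deployed. Here is a sample of my code: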
    final PCollectionView<Map<String, String>> versionCoreMapping =
        pipeline
            .apply("Read from spanner for active versions for different cores",
                SpannerIO.read()
                    .withSpannerConfig(spannerConfig)
                    .withInstanceId(options.getInstanceId())
                    .withDatabaseId(options.getDatabaseId())
                    .withQuery("SELECT version, sys, action FROM table"))
            .apply("Window Core Version Map",
                Window.into(FixedWindows.of(Duration.standardSeconds(5))))
            .apply("Core Version Map into Global Window",
                Window.into(new GlobalWindows()))
            .apply("Refresh Cache for Versioning",
                ParDo.of(new RefreshCache("sys", "version", "action")))
            .apply("convert to version table to map", View.asMap());
RefreshCache:

    public class RefreshCache extends DoFn<Struct, KV<String, String>> {
        private static final long serialVersionUID = 1;
        private String key;
        private String value;
        private String secondaryPrimaryKey;
        private static final Logger LOGGER = LoggerFactory.getLogger(RefreshCache.class);

        public RefreshCache(String key, String value, String secondaryPrimaryKey) {
            this.key = key;
            this.value = value;
            this.secondaryPrimaryKey = secondaryPrimaryKey;
        }

        @ProcessElement
        public void processElement(ProcessContext c) {
            try {
                LOGGER.info("Refreshing cache..." + key);
                Struct row = c.element();
                KV<String, String> returnMap;
                if (null != secondaryPrimaryKey) {
                    returnMap = KV.of(
                        row.getString(key).concat(",").concat(row.getString(secondaryPrimaryKey)),
                        row.getString(value));
                } else {
                    returnMap = KV.of(row.getString(key), row.getString(value));
                }
                c.output(returnMap);
            } catch (Exception e) {
                LOGGER.error("Exception: " + e.getMessage());
            }
        }
    }
When I run this pipeline, I don't see the log message inside RefreshCache repeating every 5 seconds.
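In case it helps to see what the side input is supposed to contain, here is a plain-Java sketch of the entry that RefreshCache builds for each row, with no Beam or Spanner involved. The class name `RefreshCacheSketch`, the `toEntry` helper, and the sample row values are made up for illustration, and the row is modelled as a plain `Map` instead of a Spanner `Struct`.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

public class RefreshCacheSketch {

    // Hypothetical helper mirroring RefreshCache.processElement:
    // the map key is "<key>,<secondaryKey>" when a secondary key is given,
    // otherwise just "<key>"; the map value is the column named by `value`.
    static Map.Entry<String, String> toEntry(
            Map<String, String> row, String key, String value, String secondaryKey) {
        String mapKey = row.get(key);
        if (secondaryKey != null) {
            mapKey = mapKey + "," + row.get(secondaryKey);
        }
        return new SimpleEntry<>(mapKey, row.get(value));
    }

    public static void main(String[] args) {
        // Made-up row standing in for one result of
        // "SELECT version, sys, action FROM table".
        Map<String, String> row =
            Map.of("version", "1.2.0", "sys", "coreA", "action", "read");

        // Same arguments as new RefreshCache("sys", "version", "action").
        Map.Entry<String, String> entry = toEntry(row, "sys", "version", "action");
        System.out.println(entry.getKey() + " -> " + entry.getValue());
    }
}
```

So the view should end up as a map from "sys,action" to version, and that is the map I expect to be rebuilt on each refresh.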
Any help with this would be great.
Thanks,
Craig