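Executing our pipeline in the cloud works fine. But when it runs with the DirectPipelineRunner
(i.e. locally), it fails and complains about the file pattern provided. The file pattern uses a glob.
Is this expected behavior when running locally?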
[..]
TextIO.Read.from("gs://cdf-testing/NetworkClicks_123456_2015010[1-2]*")
[..]
Feb 18, 2015 4:19:09 PM com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner run
INFO: Executing pipeline using the DirectPipelineRunner.
Feb 18, 2015 4:19:10 PM com.google.cloud.dataflow.sdk.util.GcsUtil expand
INFO: matching files in bucket cdf-testing, prefix NetworkClicks_123456_2015010[1-2] against pattern NetworkClicks_123456_2015010[1-2][^/]*
Exception in thread "main" java.lang.RuntimeException: Failed to read from source: com.google.cloud.dataflow.sdk.runners.worker.TextReader@55dbc59b
at com.google.cloud.dataflow.sdk.util.ReaderUtils.readElemsFromReader(ReaderUtils.java:40)
at com.google.cloud.dataflow.sdk.io.TextIO.evaluateReadHelper(TextIO.java:702)
at com.google.cloud.dataflow.sdk.io.TextIO.access$000(TextIO.java:98)
at com.google.cloud.dataflow.sdk.io.TextIO$Read$Bound$1.evaluate(TextIO.java:310)
at com.google.cloud.dataflow.sdk.io.TextIO$Read$Bound$1.evaluate(TextIO.java:306)
at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner$Evaluator.visitTransform(DirectPipelineRunner.java:611)
at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:200)
at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:196)
at com.google.cloud.dataflow.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:109)
at com.google.cloud.dataflow.sdk.Pipeline.traverseTopologically(Pipeline.java:204)
at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner$Evaluator.run(DirectPipelineRunner.java:584)
at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner.run(DirectPipelineRunner.java:328)
at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner.run(DirectPipelineRunner.java:70)
at com.google.cloud.dataflow.sdk.Pipeline.run(Pipeline.java:145)
at com.shinetech.tpc.engine.CDFEngine.loadClicks(CDFEngine.java:88)
at com.shinetech.tpc.engine.CDFEngine.doMagic(CDFEngine.java:75)
at com.shinetech.tpc.Main.main(Main.java:15)
Caused by: java.io.IOException: No match for file pattern 'gs://cdf-testing/NetworkClicks_123456_2015010[1-2]*'
at com.google.cloud.dataflow.sdk.runners.worker.FileBasedReader.iterator(FileBasedReader.java:101)
at com.google.cloud.dataflow.sdk.util.ReaderUtils.readElemsFromReader(ReaderUtils.java:35)
... 16 more
Answer 0 (score: 2)
No, both runners should behave the same. This sounds like a bug in the DirectRunner. Thanks for the report; we will reply here once it is fixed.
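For anyone trying to read the log in the question: the GcsUtil line shows the bucket being listed with the prefix `NetworkClicks_123456_2015010[1-2]`, which still contains the character class. A prefix in a GCS object listing is a literal string, so one plausible reading (an inference from the log, not something confirmed above) is that the listing comes back empty before the regex is ever applied. The sketch below is not the SDK's code. The class `GlobPrefixSketch`, its helper methods, and the object name in `main` are all hypothetical and only illustrate how a literal prefix that stops at `*` and `?` differs from one that also stops at `[`.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only; this is not the Dataflow SDK implementation.
class GlobPrefixSketch {

    // Returns the part of the glob that precedes the first character
    // listed in wildcardChars (the whole glob if none occurs).
    static String literalPrefix(String glob, String wildcardChars) {
        for (int i = 0; i < glob.length(); i++) {
            if (wildcardChars.indexOf(glob.charAt(i)) >= 0) {
                return glob.substring(0, i);
            }
        }
        return glob;
    }

    // Minimal glob-to-regex conversion: '*' and '?' never cross '/',
    // character classes are copied through, regex metacharacters are escaped.
    static String globToRegex(String glob) {
        StringBuilder regex = new StringBuilder();
        boolean inClass = false;
        for (char c : glob.toCharArray()) {
            if (inClass) {
                regex.append(c);
                if (c == ']') {
                    inClass = false;
                }
            } else if (c == '[') {
                inClass = true;
                regex.append(c);
            } else if (c == '*') {
                regex.append("[^/]*");
            } else if (c == '?') {
                regex.append("[^/]");
            } else {
                if (".\\+(){}^$|".indexOf(c) >= 0) {
                    regex.append('\\');
                }
                regex.append(c);
            }
        }
        return regex.toString();
    }

    public static void main(String[] args) {
        String glob = "NetworkClicks_123456_2015010[1-2]*";
        // Assumed object name, made up for this example.
        String objectName = "NetworkClicks_123456_20150101.csv";

        String naivePrefix = literalPrefix(glob, "*?");   // keeps "[1-2]" in the prefix
        String safePrefix = literalPrefix(glob, "*?[");   // stops before the character class

        System.out.println("naive prefix matches: " + objectName.startsWith(naivePrefix));
        System.out.println("safe prefix matches:  " + objectName.startsWith(safePrefix));
        System.out.println("regex matches:        "
                + Pattern.matches(globToRegex(glob), objectName));
    }
}
```

With the made-up object name, the prefix that keeps `[1-2]` is not a literal prefix of the name, while the shorter prefix is, and the generated regex (the same string as the pattern printed in the log) matches the name. If that is what happens in the local runner, a glob with only a trailing `*` would be unaffected, but that is untested here.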