Pipeline fails on GCP when writing TensorFlow Transform metadata

Time: 2018-03-29 00:16:20

Tags: google-cloud-platform google-cloud-dataflow apache-beam dataflow tensorflow-transform

I hope someone can help. I have been googling this error but have not found anything.

I have a pipeline that runs perfectly when executed locally but fails when executed on GCP. Below is the error message I get.

  

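Workflow failed. Causes: S03:Write transform fn/WriteMetadata/ResolveBeamFutures/CreateSingleton/Read+Write transform fn/WriteMetadata/ResolveBeamFutures/ResolveFutures/Do+Write transform fn/WriteMetadata/WriteMetadata failed., A work item was attempted 4 times without success. Each time the worker eventually lost contact with the service. The work item was attempted on: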

     

Traceback (most recent call last):
  File "preprocess.py", line 491, in <module>
    main()
  File "preprocess.py", line 487, in main
    transform_data(args, pipeline_options, runner)
  File "preprocess.py", line 451, in transform_data
    eval_data |= 'Identity eval' >> beam.ParDo(Identity())
  File "/Library/Python/2.7/site-packages/apache_beam/pipeline.py", line 335, in __exit__
    self.run().wait_until_finish()
  File "/Library/Python/2.7/site-packages/apache_beam/runners/dataflow/dataflow_runner.py", line 897, in wait_until_finish
    (self.state, getattr(self._runner, 'last_error_msg', None)), self)
apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException: Dataflow pipeline failed. State: FAILED, Error:
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 582, in do_work
    work_executor.execute()
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py", line 166, in execute
    op.start()
  File "apache_beam/runners/worker/operations.py", line 294, in apache_beam.runners.worker.operations.DoOperation.start (apache_beam/runners/worker/operations.c:10607)
    def start(self):
  File "apache_beam/runners/worker/operations.py", line 295, in apache_beam.runners.worker.operations.DoOperation.start (apache_beam/runners/worker/operations.c:10501)
    with self.scoped_start_state:
  File "apache_beam/runners/worker/operations.py", line 300, in apache_beam.runners.worker.operations.DoOperation.start (apache_beam/runners/worker/operations.c:9702)
    pickler.loads(self.spec.serialized_fn))
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/internal/pickler.py", line 225, in loads
    return dill.loads(s)
  File "/usr/local/lib/python2.7/dist-packages/dill/dill.py", line 277, in loads
    return load(file)
  File "/usr/local/lib/python2.7/dist-packages/dill/dill.py", line 266, in load
    obj = pik.load()
  File "/usr/lib/python2.7/pickle.py", line 858, in load
    dispatch[key](self)
  File "/usr/lib/python2.7/pickle.py", line 1083, in load_newobj
    obj = cls.__new__(cls, *args)
TypeError: __new__() takes exactly 4 arguments (1 given)

Any ideas?

Thanks,

Pedro

1 Answer:

Answer 0: (score: 0)

If the pipeline works locally but fails on GCP, you are probably running into a version mismatch between the two environments.

Which versions of TF, tf.Transform, and Beam are you running locally and on GCP?
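One way to answer that question is to print the installed versions in each environment and compare the two outputs. The snippet below is only a sketch, not something from the original post: it assumes Python 3.8+ (for `importlib.metadata`), the package list is my guess at what matters here (`dill` is included because the traceback fails inside it), and `installed_versions` is a hypothetical helper name. On the Python 2.7 setup shown in the traceback, `pip freeze` gives the same information.

```python
from importlib import metadata

# Distribution names as published on PyPI. This list is an assumption;
# adjust it to whatever your pipeline actually depends on.
PACKAGES = ["tensorflow", "tensorflow-transform", "apache-beam", "dill"]


def installed_versions(packages=PACKAGES):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record the gap instead of raising, so the report stays complete.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print("{}: {}".format(name, version or "not installed"))
```

Run it once on the machine that submits the job and once in an environment matching the Dataflow workers. Any package whose version differs between the two is a candidate for the unpickling error above.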