ImportError: No module named options.value_provider

Asked: 2017-05-20 18:30:32

Tags: google-cloud-dataflow apache-beam

The following pipeline works with DirectRunner, but raises the exception below when run with DataflowRunner. How do I debug errors like this? It seems very opaque to me.

import apache_beam as beam

# project, staging_location, temp_location, output_gcs and query are defined elsewhere.
p = beam.Pipeline("DataflowRunner", argv=[
    '--project', project,
    '--staging_location', staging_location,
    '--temp_location', temp_location,
    '--output', output_gcs
])
(p
 | 'read events' >> beam.io.Read(beam.io.BigQuerySource(query=query, use_standard_sql=True))
 | 'write' >> beam.io.WriteToText(output_gcs)
)
p.run().wait_until_finish()

Raises:

File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 578, in do_work
    work_executor.execute()
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py", line 165, in execute
    op.start()
  File "dataflow_worker/operations.py", line 350, in dataflow_worker.operations.DoOperation.start (dataflow_worker/operations.c:13064)
    def start(self):
  File "dataflow_worker/operations.py", line 351, in dataflow_worker.operations.DoOperation.start (dataflow_worker/operations.c:12958)
    with self.scoped_start_state:
  File "dataflow_worker/operations.py", line 356, in dataflow_worker.operations.DoOperation.start (dataflow_worker/operations.c:12159)
    pickler.loads(self.spec.serialized_fn))
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/internal/pickler.py", line 212, in loads
    return dill.loads(s)
  File "/usr/local/lib/python2.7/dist-packages/dill/dill.py", line 277, in loads
    return load(file)
  File "/usr/local/lib/python2.7/dist-packages/dill/dill.py", line 266, in load
    obj = pik.load()
  File "/usr/lib/python2.7/pickle.py", line 858, in load
    dispatch[key](self)
  File "/usr/lib/python2.7/pickle.py", line 1090, in load_global
    klass = self.find_class(module, name)
  File "/usr/local/lib/python2.7/dist-packages/dill/dill.py", line 423, in find_class
    return StockUnpickler.find_class(self, module, name)
  File "/usr/lib/python2.7/pickle.py", line 1124, in find_class
    __import__(module)
ImportError: No module named options.value_provider

2 Answers:

Answer 0 (score: 1)

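value_provider is a module that was recently introduced to handle templates in the Python SDK. However, I don't see any templates in your snippet, so this is probably a package mismatch. Are you using matching versions of the SDK and the workers? You can check the worker startup logs to see which versions of the packages are installed.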
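A minimal sketch of the local half of that comparison, assuming the two distributions are named apache-beam and google-cloud-dataflow on your machine. The helper name installed_version is made up for illustration. It only reports what is installed where you submit the job; the worker side still has to be read from the worker startup logs.

```python
def installed_version(name):
    """Return the installed version string of a distribution, or None."""
    try:
        from importlib import metadata  # available on Python 3.8+
    except ImportError:
        metadata = None
    if metadata is not None:
        try:
            return metadata.version(name)
        except metadata.PackageNotFoundError:
            return None
    # Fallback for older interpreters such as Python 2.7 (needs setuptools).
    import pkg_resources
    try:
        return pkg_resources.get_distribution(name).version
    except pkg_resources.DistributionNotFound:
        return None


# Distribution names are assumptions taken from the packages discussed here.
versions = {
    name: installed_version(name)
    for name in ("apache-beam", "google-cloud-dataflow")
}
for name, ver in sorted(versions.items()):
    print("%s: %s" % (name, ver or "not installed"))
```

If the versions printed locally differ from the ones in the worker startup logs, that would be consistent with the mismatch suspected above.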

Answer 1 (score: 1)

Same problem here. As Maria pointed out, it is a mismatch between the apache_beam and google-cloud-dataflow packages.

To be clear, the following command resolved it:

pip2 install --upgrade apache_beam google-cloud-dataflow
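For context on why a package mismatch shows up as an ImportError during unpickling, as in the traceback in the question: pickle stores a class by reference (module path plus name), so the loading side must be able to import that same module. The sketch below reproduces the effect with the standard pickle module only. The module name fake_sdk_options and the class ValueHolder are hypothetical stand-ins, not part of Beam.

```python
import pickle
import sys
import types

# Hypothetical module that exists on the "submitting" side only.
module_name = "fake_sdk_options"
fake_module = types.ModuleType(module_name)

# Build a class that claims to live in that module.
ValueHolder = type("ValueHolder", (object,), {"__module__": module_name})
fake_module.ValueHolder = ValueHolder
sys.modules[module_name] = fake_module

# The payload records only "fake_sdk_options.ValueHolder", not the class body.
payload = pickle.dumps(ValueHolder())

# Simulate a worker whose installed packages lack that module.
del sys.modules[module_name]

try:
    pickle.loads(payload)
    error = None
except ImportError as exc:
    error = exc

print("unpickling failed with: %r" % (error,))
```

The failure happens at load time rather than at submission, which matches a pipeline that works with DirectRunner but fails on remote workers with a different package set.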