Error when accessing BigQuery with DirectRunner

Time: 2017-05-06 15:28:26

Tags: google-cloud-dataflow apache-beam

I am trying to access a BigQuery table from my pipeline. Everything works with DataflowRunner, but with DirectRunner I get the following error.

WARNING:root:Dataset does not exist so we will create it
WARNING:root:Task failed: Traceback (most recent call last):
  File "local/lib/python2.7/site-packages/apache_beam/runners/direct/executor.py", line 300, in __call__
    result = evaluator.finish_bundle()
  File "local/lib/python2.7/site-packages/apache_beam/runners/direct/transform_evaluator.py", line 209, in finish_bundle
    bundles = _read_values_to_bundles(reader)
  File "local/lib/python2.7/site-packages/apache_beam/runners/direct/transform_evaluator.py", line 196, in _read_values_to_bundles
    read_result = [GlobalWindows.windowed_value(e) for e in reader]
  File "local/lib/python2.7/site-packages/apache_beam/io/gcp/bigquery.py", line 606, in __iter__
    yield self.client.convert_row_to_dict(row, schema)
  File "local/lib/python2.7/site-packages/apache_beam/io/gcp/bigquery.py", line 1073, in convert_row_to_dict
    for x in value]
TypeError: 'NoneType' object is not iterable

Here is the code snippet that initializes the runner:

options = {
    'project':
        args.project_id,
}
pipeline_options = beam.pipeline.PipelineOptions(flags=[], **options)

Update:

Here is a snippet of the relevant part that builds the pipeline.

reader = (
    pipeline
    | 'ReadFromBigQuery %s' % query_path >> beam.io.Read(
        beam.io.BigQuerySource(query=query, use_standard_sql=True)))
readers.append(reader)
readers | 'FlattenPaths %s' % ':'.join(path_names) >> beam.Flatten()

0 answers:

No answers yet.