Question

I created a Google Dataflow job, but I keep getting "global name 'bigquery' is not defined" even though I have imported everything it needs.

Here is my import list:

from __future__ import absolute_import

import argparse
import logging
import ast
import json

import apache_beam as beam
from apache_beam.io import ReadFromText, WriteToText 
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.options.pipeline_options import SetupOptions
from apache_beam.options.pipeline_options import StandardOptions
from google.cloud import bigquery

Here is the class that raises the error:

class CheckExistance(beam.DoFn):

    def __init__(self, table):
        self.table = table.replace(":", ".")

    def process(self, element):

        client = bigquery.Client()
        date = element['date'].split(" ")[0]

        query_job = client.query("""
        QUERY """ % (self.table, date))

        yield element

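Do you know what might be causing this error? By the way, I only get it when the pipeline is deployed as a Google Dataflow job; it runs fine locally.

Edit:

I was able to fix the original problem by moving the import into the function that uses the bigquery name, like this: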

class CheckExistance(beam.DoFn):

    def __init__(self, table):
        self.table = table.replace(":", ".")

    def process(self, element):
        from google.cloud import bigquery
        client = bigquery.Client()
        date = element['date'].split(" ")[0]

        query_job = client.query("""
        QUERY""" % (self.table, date))

        yield element
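
A note on why moving the import may help: when the DoFn is shipped to remote workers, the module-level names of the main script are not necessarily shipped with it, so a global such as bigquery can be missing at call time. The sketch below is only a standard-library simulation of that situation, not what Dataflow actually does internally; `strip_globals`, `run_on_simulated_worker`, and the use of `json` as a stand-in for `bigquery` are all hypothetical choices made for illustration. Beam also provides a `save_main_session` setup option intended for this situation; whether it would help in this particular job is not confirmed in this thread.

```python
import builtins
import json  # module-level import, standing in for `from google.cloud import bigquery`
import types


def strip_globals(func):
    """Rebuild func with no module globals, roughly imitating a function
    that reaches a worker without the main script's imports (assumption)."""
    return types.FunctionType(
        func.__code__, {"__builtins__": builtins}, func.__name__
    )


def uses_global_import(record):
    # Relies on the module-level `json` name being present at call time.
    return json.dumps(record)


def uses_local_import(record):
    # Imports inside the function body, like the edited process() method.
    import json
    return json.dumps(record)


def run_on_simulated_worker(func, record):
    """Call func without its original globals and report a NameError as text."""
    try:
        return strip_globals(func)(record)
    except NameError as exc:
        return "NameError: %s" % exc
```

Calling `run_on_simulated_worker(uses_global_import, {"a": 1})` reports a NameError for `json`, while the version with the local import still serializes the record.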

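But now I get an error saying that 'Client' has no attribute 'query', even though the packages on my Dataflow job are up to date and the code runs locally without any problems.

Error message:

AttributeError: 'Client' object has no attribute 'query'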
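
One possible explanation, which is an assumption and not something confirmed in this thread, is that the workers have a different release of the `google-cloud-bigquery` distribution installed than the local machine, one whose Client lacks a `query` method. A minimal diagnostic sketch for checking this is shown below; `installed_version` and `describe_client` are hypothetical helper names, and the code assumes Python 3.8 or later for `importlib.metadata`.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def describe_client(client):
    """Report the client's class name and whether it exposes a callable `query`."""
    return {
        "class": type(client).__name__,
        "has_query": callable(getattr(client, "query", None)),
    }
```

Logging `installed_version("google-cloud-bigquery")` and `describe_client(client)` from inside `process` would show what the worker really has. If the versions differ, pinning the package through the pipeline's `--requirements_file` option may be worth trying.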

Answer 1

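My guess is that you need to enable BigQuery:

Resources / Advanced Google services / Enable BigQuery

Edit: see the comments for the troubleshooting steps, what was found, and what ended up working.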
