Load from GCS fails with "too many table update operations for this table"

Asked: 2019-07-13 21:36:25

Tags: google-bigquery

When I try to load files from GCS into a BigQuery table (using Python), the job fails with this error:

Forbidden: 403 Exceeded rate limits: too many table update operations for this table. For more information, see https://cloud.google.com/bigquery/troubleshooting-errors

Each table loads about 10 files from GCS, but when I run this 3 times within a day I see the error above.

I also checked this page, but I still don't understand what is happening: https://cloud.google.com/bigquery/quotas#standard_tables

To give more detail, here is the relevant part of the Python code:


    from google.cloud import bigquery

    # bigquery_client, SCHEMA, table_ref, date and territories are defined
    # elsewhere in the script (this is only part of it).
    job_config = bigquery.LoadJobConfig()
    job_config.schema = SCHEMA
    job_config.source_format = bigquery.SourceFormat.NEWLINE_DELIMITED_JSON
    job_config.write_disposition = 'WRITE_APPEND'

    # This for loop runs about 10 times for a table_ref,
    # namely there are about 10 territories in territories
    load_jobs = []
    for territory in territories:
        gcs_uri = f"gs://my-bucket/path/to/file_{date}_{territory}.txt"
        load_job = bigquery_client.load_table_from_uri(
            gcs_uri, table_ref, job_config=job_config
        )
        load_job.territory = territory
        load_jobs.append(load_job)
        print(f"Starting job {territory} {load_job.job_id}")

    for load_job in load_jobs:
        load_job.result()
        print(f"Job finished {load_job.territory}.")

Thanks!

1 answer:

Answer 0 (score: 0)

It's still not clear to me why I was hitting the rate limit, but @Elliott Brossard's suggestion helped in my case.

So instead of doing this:

    for territory in territories:
        gcs_uri = f"gs://my-bucket/path/to/file_{date}_{territory}.txt"

I was able to do this:

    gcs_uri = f"gs://my-bucket/path/to/file_{date}_*.txt"

This not only fixed the rate-limit issue, it also made the load faster!
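
For reference, here is a minimal sketch of what the single wildcard load looks like, reusing the same bigquery_client, table_ref, SCHEMA and date as in the question (those names are taken from the snippet above, so adjust them to your own setup):

    from google.cloud import bigquery

    # bigquery_client, table_ref, SCHEMA and date are assumed to be the same
    # objects as in the question's snippet.
    job_config = bigquery.LoadJobConfig()
    job_config.schema = SCHEMA
    job_config.source_format = bigquery.SourceFormat.NEWLINE_DELIMITED_JSON
    job_config.write_disposition = 'WRITE_APPEND'

    # A single wildcard URI starts one load job that appends to the table once,
    # instead of ~10 separate jobs all appending to the same table.
    gcs_uri = f"gs://my-bucket/path/to/file_{date}_*.txt"
    load_job = bigquery_client.load_table_from_uri(
        gcs_uri, table_ref, job_config=job_config
    )
    print(f"Starting job {load_job.job_id}")

    load_job.result()  # wait for the load to complete
    print("Job finished.")

If a wildcard doesn't match your files cleanly, load_table_from_uri also accepts a list of URIs, which still keeps everything in a single load job.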