Question

I am using the BigQuery Java API to run about 1000 copy jobs concurrently (via scala.concurrent.Future) with WriteDisposition WRITE_APPEND, but I get

com.google.cloud.bigquery.BigQueryException: API limit exceeded: Unable to return a row that exceeds the API limits. To retrieve the row, export the table

I assumed this was caused by too much concurrency, so I tried using Monix's Task to limit the parallelism to at most 20:

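import monix.eval.Task
import monix.execution.Scheduler.Implicits.global
import scala.concurrent.Future

def execute(queries: List[Query]): Future[Seq[Boolean]] = {
  // Split the copy tasks into batches of 20; Task.gather runs one batch in parallel.
  val tasks: Iterator[Task[List[Boolean]]] = queries.map(q => BqApi.copyTable(q, destinationTable))
    .sliding(20, 20)
    .map(Task.gather(_))

  // Task.sequence runs the batches one after another and collects their results.
  val results: Task[List[Boolean]] = Task.sequence(tasks)
    .map(_.flatten.toList)

  results.runAsync
}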

Here BqApi.copyTable runs the query, copies the result to the destination table, and returns a Task[Boolean].
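For readers without Monix, the batching idea described above can be sketched with only the standard library. This is a minimal sketch, not the asker's actual code: `copyTable` here is a hypothetical stand-in that sleeps instead of calling BigQuery, and the `current`/`peak` counters exist only to make the concurrency limit observable.

```scala
import scala.concurrent.{ExecutionContext, Future}

object BatchedJobs {
  implicit val ec: ExecutionContext = ExecutionContext.global

  // Counters used only to observe how many jobs run at the same time.
  private var current = 0
  private var peakSeen = 0
  private def enter(): Unit = synchronized {
    current += 1
    if (current > peakSeen) peakSeen = current
  }
  private def leave(): Unit = synchronized { current -= 1 }
  def peak: Int = synchronized(peakSeen)

  // Hypothetical stand-in for BqApi.copyTable: sleeps briefly and reports success.
  def copyTable(query: String): Future[Boolean] = Future {
    enter()
    try { Thread.sleep(5); true } finally leave()
  }

  // Runs batches one after another; jobs inside a batch run concurrently.
  // Futures start eagerly, so each batch is created inside flatMap,
  // i.e. only after the previous batch has completed.
  def execute(queries: List[String], batchSize: Int): Future[List[Boolean]] =
    queries.grouped(batchSize).foldLeft(Future.successful(List.empty[Boolean])) {
      (acc, batch) =>
        acc.flatMap { done =>
          Future.sequence(batch.map(copyTable)).map(done ++ _)
        }
    }
}
```

Calling `BatchedJobs.execute(queries, 20)` yields one Boolean per query, with at most 20 jobs in flight at any time. Unlike a Monix Task, a Future starts as soon as it is created, which is why the batches are built lazily inside `flatMap` rather than up front.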

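The same exception still occurs.

However, if I change the WriteDisposition to WRITE_TRUNCATE, the exception goes away.

Can someone help me understand what is happening behind the scenes? Why does the BigQuery API behave this way?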

Answer 1

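This message appears when a query exceeds the maximum response size. Since copy jobs use jobs.insert, you may be hitting the maximum row size listed in the query job limits. I recommend filing a BigQuery bug on its issue tracker so that the behavior of the Java API can be described properly.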
