A mystery about BigQuery views

Time: 2015-12-30 15:50:16

Tags: sql views google-bigquery

Here is my mystery. In the console, when I run this query, it works perfectly fine:

SELECT rd.ds_id AS ds_id
FROM (SELECT ds_id, 1 AS dummy FROM bq_000010.table) rd
  INNER JOIN EACH (SELECT 1 AS dummy) cal ON (cal.dummy = rd.dummy);

I then save it as a view named dataset.myview and run:

SELECT * FROM dataset.myview LIMIT 1000

But this raises the following error:

  

SELECT query which references non constant fields or uses aggregation functions or has one or more of WHERE, OMIT IF, GROUP BY, ORDER BY clauses must have FROM clause.

However, when I try SELECT * FROM dataset.myview, i.e. without the LIMIT, it works!!

In fact, when I run my full query with a LIMIT at the bottom, it raises the same error:

SELECT rd.ds_id AS ds_id
FROM (SELECT ds_id, 1 AS dummy FROM bq_000010.table) rd
  INNER JOIN EACH (SELECT 1 AS dummy) cal ON (cal.dummy = rd.dummy) LIMIT 1000;

However, when I add an inner ORDER BY, it evaluates fine again:

SELECT rd.ds_id AS ds_id
FROM (SELECT ds_id,
             1 AS dummy
      FROM bq_000010.000010_flux_visites_ds
      ORDER BY ds_id) rd
  INNER JOIN EACH (SELECT 1 AS dummy) cal ON (cal.dummy = rd.dummy) LIMIT 1000

1 Answer:

Answer 0: (score: 1)

What happens if you apply an ORDER BY in the select on the view? Or do you need random results?
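Written out, the variant being asked about would look roughly like the query below. This is only a sketch in the same legacy SQL dialect as the question: it assumes ds_id (the only column the view exposes) is an acceptable sort key, and it has not been verified that adding the ORDER BY actually avoids the error.

```sql
-- Same select on the view as in the question, with an explicit ORDER BY
-- ahead of the LIMIT so that the 1000 returned rows are well defined.
-- Assumption: sorting on ds_id is acceptable for this use case.
SELECT *
FROM dataset.myview
ORDER BY ds_id
LIMIT 1000
```

If this runs cleanly while the plain LIMIT version fails, that would match the behavior the question reports for the inner ORDER BY.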

A query with a LIMIT clause may still be non-deterministic if there is no operator in the query that guarantees the ordering of the output result set. This is because BigQuery executes using a large number of parallel workers. The order in which parallel jobs return is not guaranteed.

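I'm not sure why the ORDER BY makes a difference here. However, it is generally odd to see a LIMIT without any ORDER BY, which is why I asked about the ordering. A complete SWAG would be that the parallel workers finish the outer join and the LIMIT before the inner select completes, which results in an internal error; applying an ORDER BY would then force the system to materialize the records before the inner join is executed.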

But I really have no CLUE.