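Is there another way to add calculated information to a table?

Time: 2019-09-25 15:51:22

Tags: sql google-bigquery

I have a table in BigQuery that contains some information, and I need to create another table with that information aggregated by name, where some of the column values depend on certain conditions.

Here is an example of the table: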

CREATE TABLE EMP (
ID INT,
NAME CHAR,
ORDER_ID INT,
VALUE INT
);

INSERT INTO EMP VALUES (7369,'SMITH',1,5);
INSERT INTO EMP VALUES (7499,'ALLEN',2,10);
INSERT INTO EMP VALUES (7521,'JONES',3,15);
INSERT INTO EMP VALUES (7566,'JONES',4,5);
INSERT INTO EMP VALUES (7568,'JONES',5,10);

Here is a simple aggregation by name:

SELECT name as client_name, min(order_id) as f_order,
max(order_id) as l_order, sum(VALUE) as total_order_value
FROM emp
GROUP BY name

Output:

client_name|f_order|l_order|total_order_value    
ALLEN      |2      |2      |10
JONES      |3      |5      |30
SMITH      |1      |1      |5

I also need to add one more column, "f_order_value", whose value is taken from the "VALUE" column of the row where f_order = order_id:

client_name|f_order|l_order|total_order_value|f_order_value    
ALLEN      |2      |2      |10               |10
JONES      |3      |5      |30               |15
SMITH      |1      |1      |5                |5

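So I tried to create a virtual table (a CTE) and use it, but it does not work because I am not using any aggregation in the outer query, and I also do not fully understand how to use virtual tables: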

with first_table as (SELECT name as client_name,
min(order_id) as f_order, max(order_id) as l_order,
sum(VALUE) as total_order_value
FROM emp
GROUP BY name)
select first_table.*, IF(f.f_order=e.order_id, o.VALUE,null) as
order_value from first_table f
join EMP e on f.client_name=e.name group by name

Error:

Star expansion expression references column site which is neither grouped nor aggregated
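One likely reading of the error: the outer query combines a star expansion with GROUP BY name, so the expanded columns are neither grouped nor aggregated (the alias o is also never defined). A minimal sketch of the CTE approach without the outer GROUP BY is to join the aggregate back to the base table on both the name and the first order ID. The sample rows are inlined here only to make the query self-contained, and the sketch assumes order_id is unique within each name; otherwise the join would return more than one row per client.

```sql
-- Sketch only: sample rows from the question are inlined as a CTE.
WITH emp AS (
  SELECT 7369 AS id, 'SMITH' AS name, 1 AS order_id, 5 AS value UNION ALL
  SELECT 7499, 'ALLEN', 2, 10 UNION ALL
  SELECT 7521, 'JONES', 3, 15 UNION ALL
  SELECT 7566, 'JONES', 4, 5 UNION ALL
  SELECT 7568, 'JONES', 5, 10
),
first_table AS (
  -- Per-client aggregates, exactly as in the question.
  SELECT name AS client_name,
         MIN(order_id) AS f_order,
         MAX(order_id) AS l_order,
         SUM(value) AS total_order_value
  FROM emp
  GROUP BY name
)
-- Join back to pick up the VALUE of each client's first order.
-- No outer GROUP BY: the join yields one row per client
-- as long as order_id is unique within a name.
SELECT f.client_name,
       f.f_order,
       f.l_order,
       f.total_order_value,
       e.value AS f_order_value
FROM first_table AS f
JOIN emp AS e
  ON e.name = f.client_name
 AND e.order_id = f.f_order
ORDER BY f.client_name;
```

On the sample data this should reproduce the desired output shown above, at the cost of reading emp twice.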

3 Answers:

Answer 0: (score: 2)

If you are using MySQL 8+, one option is to use MIN as an analytic function in a CTE and then aggregate over that CTE:

WITH cte AS (
    SELECT *, MIN(order_id) OVER (PARTITION BY name) min_order_id
    FROM emp
)

SELECT
    name,
    MIN(order_id) AS f_order,
    MAX(order_id) AS l_order,
    SUM(VALUE) AS total_order_value,
    SUM(CASE WHEN order_id = min_order_id THEN VALUE ELSE 0 END) AS f_order_value
FROM cte
GROUP BY
    name;

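The basic solution is to add a conditional sum to your current GROUP BY query that picks up the VALUE for each person. The difficulty is that we need to know each person's minimum order_id before aggregating. I could not find a way to do this without first scanning the table once (hence the CTE used above).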
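To see why the conditional sum works, it may help to look at what the CTE produces before any grouping happens. The sketch below inlines the question's sample rows (an assumption made only so the query is self-contained) and returns the intermediate rows:

```sql
-- Sketch only: inspect the CTE output before aggregation.
WITH emp AS (
  SELECT 7369 AS id, 'SMITH' AS name, 1 AS order_id, 5 AS value UNION ALL
  SELECT 7499, 'ALLEN', 2, 10 UNION ALL
  SELECT 7521, 'JONES', 3, 15 UNION ALL
  SELECT 7566, 'JONES', 4, 5 UNION ALL
  SELECT 7568, 'JONES', 5, 10
)
SELECT name,
       order_id,
       value,
       -- Every row carries the smallest order_id of its own name.
       MIN(order_id) OVER (PARTITION BY name) AS min_order_id
FROM emp
ORDER BY name, order_id;
```

Each of the three JONES rows carries min_order_id = 3, so only the row with order_id = 3 satisfies order_id = min_order_id, and the CASE expression contributes its VALUE of 15 and 0 for the other two rows.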

Answer 1: (score: 1)

If you want the value from the row with the minimum order ID in BigQuery, I would suggest:

SELECT name as client_name, MIN(order_id) as f_order,
       MAX(order_id) as l_order,
       SUM(VALUE) as total_order_value,
       ARRAY_AGG(value ORDER BY order_id LIMIT 1)[SAFE_ORDINAL(1)] as min_order_value
FROM emp
GROUP BY name;

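BigQuery does not directly support a "first" aggregate function (although there is a first_value() window function). The array_agg() approach shown here is the one commonly used instead.

No subquery, CTE, or JOIN is needed.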
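For comparison, here is a rough sketch of what a first_value() version might look like. It is not part of the suggestion above. Because window functions return one row per input row, every aggregate is written as a window function and SELECT DISTINCT collapses the duplicates. The sample rows are inlined only to keep the query self-contained.

```sql
-- Sketch only: FIRST_VALUE() alternative with inlined sample rows.
WITH emp AS (
  SELECT 7369 AS id, 'SMITH' AS name, 1 AS order_id, 5 AS value UNION ALL
  SELECT 7499, 'ALLEN', 2, 10 UNION ALL
  SELECT 7521, 'JONES', 3, 15 UNION ALL
  SELECT 7566, 'JONES', 4, 5 UNION ALL
  SELECT 7568, 'JONES', 5, 10
)
SELECT DISTINCT
       name AS client_name,
       MIN(order_id) OVER (PARTITION BY name) AS f_order,
       MAX(order_id) OVER (PARTITION BY name) AS l_order,
       SUM(value)    OVER (PARTITION BY name) AS total_order_value,
       -- Value of the row with the lowest order_id in each partition.
       FIRST_VALUE(value) OVER (PARTITION BY name ORDER BY order_id) AS f_order_value
FROM emp
ORDER BY client_name;
```

This form is more verbose and may do more work than the array_agg() version, which is one reason the latter tends to be preferred.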

Answer 2: (score: 1)

I would do the following (for BigQuery Standard SQL):

#standardSQL
SELECT name AS client_name, 
  ARRAY_AGG(STRUCT(order_id AS f_order, value AS f_order_value ) ORDER BY order_id LIMIT 1)[OFFSET(0)].*,
  MAX(order_id) AS l_order, 
  SUM(VALUE) AS total_order_value
FROM `project.dataset.emp`
GROUP BY name

Applied to the sample data from your question, the result is:

Row client_name f_order f_order_value   l_order total_order_value    
1   ALLEN       2       10              2       10   
2   JONES       3       15              5       30   
3   SMITH       1       5               1       5