How to output multiple records from a single record

Time: 2019-03-13 11:31:36

Tags: sql google-bigquery

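I have a table "Managers" that contains the following data:

[image: sample data in the Managers table]

I expect the output to look like this:

[image: expected output]

Alternatively, this output format would also work for me:

[image: alternative expected output]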

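The conditions are: manager 1001 joined in 2018 and has an end date of 9999, so he is active in 2018, 2019, and 2020.

Manager 1004 joined the company in 2018 and left in the same year, so he is active only in 2018.

Please help me achieve this.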

2 Answers:

Answer 0: (score: 1)

Build a list of years and JOIN against it:

SELECT manager_id, yearnum, 'Active' AS status
FROM UNNEST(GENERATE_ARRAY(2018, 2020)) AS yearnum
JOIN managers ON yearnum BETWEEN EXTRACT(year FROM eff_start_date)
                             AND EXTRACT(year FROM eff_end_date)
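
To try this join without an existing table, one option is to inline a couple of rows in a CTE. This is only a minimal sketch: the `managers` CTE, its DATE-typed columns, and the two sample rows (managers 1001 and 1004 from the question, with dates borrowed from the test data in the other answer) are assumptions for illustration.

```sql
#standardSQL
-- Sample rows inlined for illustration; column names and DATE types are assumptions.
WITH managers AS (
  SELECT 1001 AS manager_id, DATE '2018-02-10' AS eff_start_date, DATE '9999-12-31' AS eff_end_date UNION ALL
  SELECT 1004, DATE '2018-02-16', DATE '2018-12-31'
)
SELECT manager_id, yearnum, 'Active' AS status
FROM UNNEST(GENERATE_ARRAY(2018, 2020)) AS yearnum   -- fixed list of years: 2018, 2019, 2020
JOIN managers
  ON yearnum BETWEEN EXTRACT(YEAR FROM eff_start_date)
                 AND EXTRACT(YEAR FROM eff_end_date)
ORDER BY manager_id, yearnum
```

With these rows, manager 1001 should come back for 2018, 2019, and 2020, and manager 1004 for 2018 only. The year range is hardcoded here, so it would need adjusting as time moves on.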

Answer 1: (score: 1)

Below is for BigQuery Standard SQL

#standardSQL
WITH years AS (
  -- One row per calendar year between the earliest start date and the
  -- latest real end date (the open-ended 9999-12-31 value is excluded)
  SELECT EXTRACT(YEAR FROM year) year
  FROM ( SELECT
    (SELECT MIN(eff_start_date) FROM `project.dataset.managers`) AS min_date,
    (SELECT MAX(eff_end_date) FROM `project.dataset.managers` WHERE eff_end_date != '9999-12-31') max_date
  ), UNNEST(GENERATE_DATE_ARRAY(DATE_TRUNC(min_date, YEAR), DATE_TRUNC(max_date, YEAR), INTERVAL 1 YEAR)) year
), managers_list AS (
  -- Reduce each manager's effective dates to a start year and an end year
  SELECT manager_id, status, EXTRACT(YEAR FROM eff_start_date) start_year, EXTRACT(YEAR FROM eff_end_date) end_year
  FROM `project.dataset.managers`
)
-- Keep every (manager, year) pair where the year falls inside the manager's range
SELECT manager_id, year, status
FROM years y, managers_list m
WHERE year BETWEEN start_year AND end_year

You can test and play with the above using the sample data from your question, as in the example below:

#standardSQL
WITH `project.dataset.managers` AS (
  SELECT 1001 manager_id, 'Active' status, DATE '2018-02-10' eff_start_date, DATE '9999-12-31' eff_end_date UNION ALL
  SELECT 1002, 'Active', '2018-02-14', '2020-12-31' UNION ALL
  SELECT 1003, 'Active', '2018-02-16', '2019-02-15' UNION ALL
  SELECT 1004, 'Active', '2018-02-16', '2018-12-31' 
), years AS (
  SELECT EXTRACT(YEAR FROM year) year 
  FROM ( SELECT 
    (SELECT MIN(eff_start_date) FROM `project.dataset.managers`) AS min_date, 
    (SELECT MAX(eff_end_date) FROM `project.dataset.managers` WHERE eff_end_date != '9999-12-31') max_date  
  ), UNNEST(GENERATE_DATE_ARRAY(DATE_TRUNC(min_date, YEAR), DATE_TRUNC(max_date, YEAR), INTERVAL 1 YEAR)) year
), managers_list AS (
  SELECT manager_id, status, EXTRACT(YEAR FROM eff_start_date) start_year, EXTRACT(YEAR FROM eff_end_date) end_year
  FROM `project.dataset.managers`
)
SELECT manager_id, year, status 
FROM years y, managers_list m
WHERE year BETWEEN start_year AND end_year 
-- ORDER BY manager_id, year   

with the result:

Row manager_id  year    status   
1   1001        2018    Active   
2   1001        2019    Active   
3   1001        2020    Active   
4   1002        2018    Active   
5   1002        2019    Active   
6   1002        2020    Active   
7   1003        2018    Active   
8   1003        2019    Active   
9   1004        2018    Active
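
As a variation on the same idea, each row can also be expanded directly with a correlated `UNNEST`, without building a separate list of years first. The following is only a sketch under stated assumptions: the inline `managers` CTE reuses the sample rows above, and the cap year 2020 is an assumed upper bound taken from the question's expectation for the open-ended 9999 end date.

```sql
#standardSQL
-- Inline sample rows; in practice this would be the real managers table.
WITH managers AS (
  SELECT 1001 AS manager_id, 'Active' AS status, DATE '2018-02-10' AS eff_start_date, DATE '9999-12-31' AS eff_end_date UNION ALL
  SELECT 1002, 'Active', DATE '2018-02-14', DATE '2020-12-31' UNION ALL
  SELECT 1003, 'Active', DATE '2018-02-16', DATE '2019-02-15' UNION ALL
  SELECT 1004, 'Active', DATE '2018-02-16', DATE '2018-12-31'
)
SELECT manager_id, yr AS year, status
FROM managers,
  -- One array element per year from the start year to the end year;
  -- LEAST caps the open-ended 9999 end date at 2020 (assumed cap).
  UNNEST(GENERATE_ARRAY(
    EXTRACT(YEAR FROM eff_start_date),
    LEAST(EXTRACT(YEAR FROM eff_end_date), 2020)
  )) AS yr
ORDER BY manager_id, year
```

For this sample data it should return the same nine rows as the result above. Unlike the `years` CTE, the cap is a constant here, so it does not follow the data automatically.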