I have a SQL Employee table that describes how much each user likes particular metals. The table looks like this:
Employee_number Rank_1 Rank_2 Rank_3 Rank_4 Rank_5
1 Gold null null null null
2 bronze Gold null null null
3 Gold platinum null null null
4 Gold copper null null null
5 Gold bronze platinum null null
6 Gold bronze platinum null null
7 Gold platinum Silver null null
8 Gold platinum Silver null null
9 Gold platinum business null null
10 null null null null null
11 Silver bronze business platinum Gold
The Employee_number field is unique.
There is also another table that describes the overall ranking of the metals. It looks like this:
Metal Rank
Gold 1
platinum 2
silver 3
copper 4
bronze 5
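What I want to do is fill in default metals, according to their overall rank, wherever an employee has null values.
For example, for employee 10 all values are null, so it is simple: his Rank_1 metal will be Gold, Rank_2 will be platinum, Rank_3 will be silver, Rank_4 will be copper and Rank_5 will be bronze.
Employee 1 already has a Rank_1 metal but nothing for the other ranks, so Rank_2 is filled with platinum, Rank_3 with silver, Rank_4 with copper and Rank_5 with bronze.
Employee 2 has bronze as his first metal and Gold as his second, so his Rank_3 metal will be platinum, Rank_4 will be silver and Rank_5 will be copper.
Similarly, take employee 6, who has his first three ranks filled, so ranks 4 and 5 need to be filled in: his Rank_4 metal will be silver and his Rank_5 metal will be copper.
Does anyone have any suggestions on what the SQL for this could look like? I am using BigQuery.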
Answer 0 (score: 3)
Below is for BigQuery Standard SQL. Hopefully you will use this for a real use case.
#standardSQL
WITH metals AS (
  SELECT 'Gold' Metal, 1 RANK UNION ALL SELECT 'platinum', 2 UNION ALL
  SELECT 'silver', 3 UNION ALL SELECT 'copper', 4 UNION ALL SELECT 'bronze', 5
)
SELECT Employee_number,
  MAX(IF(pos=0, Metal, NULL)) Rank_1,
  MAX(IF(pos=1, Metal, NULL)) Rank_2,
  MAX(IF(pos=2, Metal, NULL)) Rank_3,
  MAX(IF(pos=3, Metal, NULL)) Rank_4,
  MAX(IF(pos=4, Metal, NULL)) Rank_5
FROM (
  SELECT Employee_number,
    ARRAY_CONCAT(
      ARRAY(SELECT Metal FROM (
          SELECT 1 a, Rank_1 Metal UNION ALL SELECT 2, Rank_2 UNION ALL
          SELECT 3, Rank_3 UNION ALL SELECT 4, Rank_4 UNION ALL
          SELECT 5, Rank_5 )
        WHERE NOT Metal IS NULL
        ORDER BY a
      ), ARRAY(SELECT Metal FROM metals m
        WHERE NOT LOWER(Metal) IN (
          SELECT x FROM UNNEST(ARRAY(
            SELECT LOWER(b) FROM (
              SELECT Rank_1 b UNION ALL SELECT Rank_2 UNION ALL
              SELECT Rank_3 UNION ALL SELECT Rank_4 UNION ALL
              SELECT Rank_5 )
            WHERE NOT b IS NULL
          )) x
        ) ORDER BY RANK
    )) arr
  FROM `project.dataset.employee`
), UNNEST(arr) Metal WITH OFFSET pos
GROUP BY Employee_number
ORDER BY Employee_number
You can test the above using the dummy data from your question:
#standardSQL
WITH `project.dataset.employee` AS (
  SELECT 1 Employee_number, 'Gold' Rank_1, NULL Rank_2, NULL Rank_3, NULL Rank_4, NULL Rank_5 UNION ALL
  SELECT 2, 'bronze', 'Gold', NULL, NULL, NULL UNION ALL
  SELECT 3, 'Gold', 'platinum', NULL, NULL, NULL UNION ALL
  SELECT 4, 'Gold', 'copper', NULL, NULL, NULL UNION ALL
  SELECT 5, 'Gold', 'bronze', 'platinum', NULL, NULL UNION ALL
  SELECT 6, 'Gold', 'bronze', 'platinum', NULL, NULL UNION ALL
  SELECT 7, 'Gold', 'platinum', 'Silver', NULL, NULL UNION ALL
  SELECT 8, 'Gold', 'platinum', 'Silver', NULL, NULL UNION ALL
  SELECT 9, 'Gold', 'platinum', 'business', NULL, NULL UNION ALL
  SELECT 10, NULL, NULL, NULL, NULL, NULL UNION ALL
  SELECT 11, 'Silver', 'bronze', 'business', 'platinum', 'Gold'
), metals AS (
  SELECT 'Gold' Metal, 1 RANK UNION ALL SELECT 'platinum', 2 UNION ALL
  SELECT 'silver', 3 UNION ALL SELECT 'copper', 4 UNION ALL SELECT 'bronze', 5
)
SELECT Employee_number,
  MAX(IF(pos=0, Metal, NULL)) Rank_1,
  MAX(IF(pos=1, Metal, NULL)) Rank_2,
  MAX(IF(pos=2, Metal, NULL)) Rank_3,
  MAX(IF(pos=3, Metal, NULL)) Rank_4,
  MAX(IF(pos=4, Metal, NULL)) Rank_5
FROM (
  SELECT Employee_number,
    ARRAY_CONCAT(
      ARRAY(SELECT Metal FROM (
          SELECT 1 a, Rank_1 Metal UNION ALL SELECT 2, Rank_2 UNION ALL
          SELECT 3, Rank_3 UNION ALL SELECT 4, Rank_4 UNION ALL
          SELECT 5, Rank_5 )
        WHERE NOT Metal IS NULL
        ORDER BY a
      ), ARRAY(SELECT Metal FROM metals m
        WHERE NOT LOWER(Metal) IN (
          SELECT x FROM UNNEST(ARRAY(
            SELECT LOWER(b) FROM (
              SELECT Rank_1 b UNION ALL SELECT Rank_2 UNION ALL
              SELECT Rank_3 UNION ALL SELECT Rank_4 UNION ALL
              SELECT Rank_5 )
            WHERE NOT b IS NULL
          )) x
        ) ORDER BY RANK
    )) arr
  FROM `project.dataset.employee`
), UNNEST(arr) Metal WITH OFFSET pos
GROUP BY Employee_number
ORDER BY Employee_number
Result:
Row Employee_number Rank_1 Rank_2 Rank_3 Rank_4 Rank_5
1 1 Gold platinum silver copper bronze
2 2 bronze Gold platinum silver copper
3 3 Gold platinum silver copper bronze
4 4 Gold copper platinum silver bronze
5 5 Gold bronze platinum silver copper
6 6 Gold bronze platinum silver copper
7 7 Gold platinum Silver copper bronze
8 8 Gold platinum Silver copper bronze
9 9 Gold platinum business silver copper
10 10 Gold platinum silver copper bronze
11 11 Silver bronze business platinum Gold
Note: the above solution assumes that filled and NULL metals are not interleaved, which means there are three options:
1. all Rank fields filled already with Metal
2. all Rank fields are NULL
3. first 1 or more fields filled with Metal and rest are NULLs
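Having said this: the first array is built from the fields that are already filled; the second array is built from the remaining metals in the metals table (compared case-insensitively, so 'Silver' matches 'silver'); the two arrays are then concatenated, and the first 5 elements are used to recreate the original table.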
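If it helps to check the array logic without running BigQuery, here is a minimal Python sketch of the same idea. It is not part of the query: the function name fill_ranks and the width parameter are made up for illustration, and the metal list is simply copied from the metals table in the question.

```python
# Overall metal ranking, best first (copied from the metals table in the question).
METALS = ["Gold", "platinum", "silver", "copper", "bronze"]


def fill_ranks(ranks, metals=METALS, width=5):
    """Return `width` ranks: the employee's own picks, then unused default metals."""
    # First array: the employee's non-NULL picks, in their original order.
    chosen = [m for m in ranks if m is not None]
    # Second array: default metals not already picked (case-insensitive), in rank order.
    taken = {m.lower() for m in chosen}
    rest = [m for m in metals if m.lower() not in taken]
    # Concatenate and keep the first `width` elements, padding with None if short.
    merged = (chosen + rest)[:width]
    return merged + [None] * (width - len(merged))


# Employee 4 from the dummy data.
print(fill_ranks(["Gold", "copper", None, None, None]))
# ['Gold', 'copper', 'platinum', 'silver', 'bronze']
```

Unknown values such as 'business' are kept as picked and simply do not remove anything from the default list, which is why employee 9 ends up with silver and copper after it.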
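Hope this is not too messy.
P.S. The above solution can be relatively easily extended to the case where NULLs and filled metals are mixed, but that looks to be out of scope here :o)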