SQL: Comparing two tables with different values produces duplicate results

Time: 2019-05-09 12:56:08

Tags: sql google-bigquery

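I want to join/merge two tables that each have a primary key, a category, and a score, so that the result shows the primary key together with every category and score that appears in either table. If a given category exists in only one table, the score from the other table should be null.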

The tables are as follows:

opinion_1

fruit   category    score
apple   color   15
apple   sweet   50
apple   scent   35
orange  color   40
orange  sweet   60

opinion_2

fruit   category    score
apple   color   28
apple   sweet   12
orange  color   29
orange  sweet   50
orange  scent   31

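I tried a full outer join, and also a double left join combined with UNION, but both give the same result, in which the categories are incorrectly multiplied: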

WITH opinion_1 AS (
  SELECT 'apple' as fruit, 'color' as category, 15 as score UNION ALL
  SELECT 'apple',   'sweet',    50 UNION ALL
  SELECT 'apple', 'scent',  35 UNION ALL
  SELECT 'orange', 'color', 40 UNION ALL
  SELECT 'orange', 'sweet', 60
), opinion_2 AS (
  SELECT 'apple' as fruit, 'color' as category, 28 as score UNION ALL
  SELECT 'apple',   'sweet',    12 UNION ALL
  SELECT 'orange', 'color', 29 UNION ALL
  SELECT 'orange', 'sweet', 50 UNION ALL
  SELECT 'orange', 'scent', 31
)
SELECT
  opinion_1.fruit,
  opinion_1.category as category,
  opinion_1.score as score1,
  opinion_2.score as score2
FROM opinion_1
full outer join opinion_2
on opinion_1.fruit = opinion_2.fruit

I expect the following result:

fruit   category    score1  score2
apple   color   15  28
apple   sweet   50  12
apple   scent   35  null
orange  color   40  29
orange  sweet   60  50
orange  scent   null    31

But I get:

fruit   category    score1  score2
apple   color   15  12
apple   color   15  28
apple   sweet   50  12
apple   sweet   50  28
apple   scent   35  12
apple   scent   35  28
orange  color   40  50
orange  color   40  31
orange  color   40  29
orange  sweet   60  50
orange  sweet   60  31
orange  sweet   60  29

2 Answers:

Answer 0: (score: 2)

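I think you are missing a join condition needed to get the desired result: the join matches on fruit only, so every category of a fruit in opinion_1 is paired with every category of that fruit in opinion_2. Also, if a fruit/category pair has no record in opinion_1 but does have one in opinion_2, selecting opinion_1.fruit and opinion_1.category will yield null for that row. The following query produces the expected result: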

WITH opinion_1 AS (
  SELECT 'apple' as fruit, 'color' as category, 15 as score UNION ALL
  SELECT 'apple',   'sweet',    50 UNION ALL
  SELECT 'apple', 'scent',  35 UNION ALL
  SELECT 'orange', 'color', 40 UNION ALL
  SELECT 'orange', 'sweet', 60
), opinion_2 AS (
  SELECT 'apple' as fruit, 'color' as category, 28 as score UNION ALL
  SELECT 'apple',   'sweet',    12 UNION ALL
  SELECT 'orange', 'color', 29 UNION ALL
  SELECT 'orange', 'sweet', 50 UNION ALL
  SELECT 'orange', 'scent', 31
)
SELECT
  coalesce(opinion_1.fruit, opinion_2.fruit) as fruit,
  coalesce(opinion_1.category, opinion_2.category) as category,
  opinion_1.score as score1,
  opinion_2.score as score2
FROM opinion_1
full outer join opinion_2
on opinion_1.fruit = opinion_2.fruit and opinion_1.category = opinion_2.category
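
To see why joining on fruit alone multiplies the rows, it can help to count the matches per fruit. The sketch below reuses the sample data from the question; the aliases a and b and the column name joined_rows are made up for illustration. With this data, apple has 3 rows in opinion_1 and 2 in opinion_2, and orange has 2 and 3, so each fruit should come back with 6 joined rows, which adds up to the 12 rows shown in the question.

```sql
#standardSQL
WITH opinion_1 AS (
  SELECT 'apple' AS fruit, 'color' AS category, 15 AS score UNION ALL
  SELECT 'apple', 'sweet', 50 UNION ALL
  SELECT 'apple', 'scent', 35 UNION ALL
  SELECT 'orange', 'color', 40 UNION ALL
  SELECT 'orange', 'sweet', 60
), opinion_2 AS (
  SELECT 'apple' AS fruit, 'color' AS category, 28 AS score UNION ALL
  SELECT 'apple', 'sweet', 12 UNION ALL
  SELECT 'orange', 'color', 29 UNION ALL
  SELECT 'orange', 'sweet', 50 UNION ALL
  SELECT 'orange', 'scent', 31
)
-- Joining on fruit only pairs every opinion_1 row of a fruit
-- with every opinion_2 row of the same fruit.
SELECT
  a.fruit,
  COUNT(*) AS joined_rows  -- hypothetical column name, for illustration
FROM opinion_1 a
JOIN opinion_2 b
ON a.fruit = b.fruit
GROUP BY a.fruit
```

Adding opinion_1.category = opinion_2.category to the join condition limits each row to at most one match, which removes the multiplication.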

Answer 1: (score: 1)

The following is for BigQuery Standard SQL:

#standardSQL
WITH opinion_1 AS (
  SELECT 'apple' AS fruit, 'color' AS category, 15 AS score UNION ALL
  SELECT 'apple',   'sweet',    50 UNION ALL
  SELECT 'apple', 'scent',  35 UNION ALL
  SELECT 'orange', 'color', 40 UNION ALL
  SELECT 'orange', 'sweet', 60
), opinion_2 AS (
  SELECT 'apple' AS fruit, 'color' AS category, 28 AS score UNION ALL
  SELECT 'apple',   'sweet',    12 UNION ALL
  SELECT 'orange', 'color', 29 UNION ALL
  SELECT 'orange', 'sweet', 50 UNION ALL
  SELECT 'orange', 'scent', 31
)
SELECT
  IFNULL(a.fruit, b.fruit) fruit,
  IFNULL(a.category, b.category) AS category,
  a.score AS score1,
  b.score AS score2
FROM opinion_1 a
FULL OUTER JOIN opinion_2 b
USING(fruit, category)   

with the result:

Row fruit   category    score1  score2   
1   apple   color       15      28   
2   apple   sweet       50      12   
3   apple   scent       35      null     
4   orange  color       40      29   
5   orange  sweet       60      50   
6   orange  scent       null    31
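
Since the question also mentions trying UNION, here is a minimal sketch of how the same result could be built without a full outer join, by stacking both tables and pivoting with conditional aggregation. It is not taken from either answer. It assumes each (fruit, category) pair appears at most once per table, and src is a hypothetical helper column marking which table a row came from.

```sql
#standardSQL
WITH opinion_1 AS (
  SELECT 'apple' AS fruit, 'color' AS category, 15 AS score UNION ALL
  SELECT 'apple', 'sweet', 50 UNION ALL
  SELECT 'apple', 'scent', 35 UNION ALL
  SELECT 'orange', 'color', 40 UNION ALL
  SELECT 'orange', 'sweet', 60
), opinion_2 AS (
  SELECT 'apple' AS fruit, 'color' AS category, 28 AS score UNION ALL
  SELECT 'apple', 'sweet', 12 UNION ALL
  SELECT 'orange', 'color', 29 UNION ALL
  SELECT 'orange', 'sweet', 50 UNION ALL
  SELECT 'orange', 'scent', 31
), stacked AS (
  -- src is a hypothetical marker: 1 for opinion_1, 2 for opinion_2
  SELECT 1 AS src, fruit, category, score FROM opinion_1
  UNION ALL
  SELECT 2 AS src, fruit, category, score FROM opinion_2
)
SELECT
  fruit,
  category,
  -- MAX ignores NULLs, so each column keeps the score from its own table
  -- and stays NULL when that table has no row for the pair
  MAX(IF(src = 1, score, NULL)) AS score1,
  MAX(IF(src = 2, score, NULL)) AS score2
FROM stacked
GROUP BY fruit, category
```

Without an ORDER BY the row order is not guaranteed. If a table could contain the same (fruit, category) pair more than once, MAX would silently keep only the largest score, so the full outer join is the more direct choice in that case.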