Query to show which attributes, out of a superset of attributes, each instance has

Time: 2019-04-12 16:18:44

Tags: sql google-bigquery report name-value

I have a relational dataset in BigQuery consisting of two tables.

The first table holds customer data:

+-------------+--------+
| Customer ID | Name   |
+-------------+--------+
| 1           | Bob    |
+-------------+--------+
| 2           | Jenny  |
+-------------+--------+
| 3           | Janice |
+-------------+--------+

The second table holds various name/value pairs related to the customers in the first table:

+-------------+----------+-------+
| Customer ID | Category | Value |
+-------------+----------+-------+
| 1           | A        | A     |
+-------------+----------+-------+
| 1           | A        | B     |
+-------------+----------+-------+
| 1           | B        | A     |
+-------------+----------+-------+
| 2           | B        | B     |
+-------------+----------+-------+

I'd like to produce a report that lists every customer and puts TRUE under each name:value pair found for that customer in table 2, e.g.:

+-------------+------+------+-----+------+------+
| Customer ID | A:A  | A:B  | A:C | B:A  | B:B  |
+-------------+------+------+-----+------+------+
| 1           | TRUE | TRUE |     | TRUE |      |
+-------------+------+------+-----+------+------+
| 2           |      |      |     |      | TRUE |
+-------------+------+------+-----+------+------+
| 3           |      |      |     |      |      |
+-------------+------+------+-----+------+------+

I tried specifying each category:value combination as a column in my select statement, but that got me nowhere, because I can't figure out how to make the query set a cell to TRUE once the value is found.

Sorry, I'm very new to SQL.

2 Answers:

Answer 0: (score: 1)

You need some kind of aggregation, for example:

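-- BigQuery standard SQL spells this aggregate LOGICAL_OR (bool_or in some other dialects).
-- String comparison is case-sensitive, so the literals match the uppercase sample data.
select t1.customer_id,
       logical_or(t2.category = 'A' and t2.value = 'A') as a_a,
       logical_or(t2.category = 'A' and t2.value = 'B') as a_b,
       logical_or(t2.category = 'A' and t2.value = 'C') as a_c,
       logical_or(t2.category = 'B' and t2.value = 'A') as b_a,
       logical_or(t2.category = 'B' and t2.value = 'B') as b_b
from table_1 t1 join
     table_2 t2 
     on t1.customer_id = t2.customer_id  
group by t1.customer_id;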

Answer 1: (score: 1)

Below is for BigQuery Standard SQL

#standardSQL
SELECT customer_id,
  LOGICAL_OR((category, value) = ('A', 'A')) AS a_a,
  LOGICAL_OR((category, value) = ('A', 'B')) AS a_b,
  LOGICAL_OR((category, value) = ('A', 'C')) AS a_c,
  LOGICAL_OR((category, value) = ('B', 'A')) AS b_a,
  LOGICAL_OR((category, value) = ('B', 'B')) AS b_b
FROM `project.dataset.table1`  
JOIN `project.dataset.table2`
USING (customer_id)
GROUP BY customer_id   

You can test and play with the above using the sample data from your question, as in the example below

#standardSQL
WITH `project.dataset.table1` AS (
  SELECT 1 Customer_ID, 'Bob' Name UNION ALL
  SELECT 2, 'Jenny' UNION ALL
  SELECT 3, 'Janice' 
), `project.dataset.table2` AS (
  SELECT 1 Customer_ID, 'A' Category, 'A' Value UNION ALL
  SELECT 1, 'A', 'B' UNION ALL
  SELECT 1, 'B', 'A' UNION ALL
  SELECT 2, 'B', 'B' 
)
SELECT customer_id,
  LOGICAL_OR((category, value) = ('A', 'A')) AS a_a,
  LOGICAL_OR((category, value) = ('A', 'B')) AS a_b,
  LOGICAL_OR((category, value) = ('A', 'C')) AS a_c,
  LOGICAL_OR((category, value) = ('B', 'A')) AS b_a,
  LOGICAL_OR((category, value) = ('B', 'B')) AS b_b
FROM `project.dataset.table1`  
JOIN `project.dataset.table2`
USING (customer_id)
GROUP BY customer_id   

with result

Row customer_id a_a     a_b     a_c     b_a     b_b  
1   1           true    true    false   true    false    
2   2           false   false   false   false   true      

If you need/want the output to look exactly as in the question, you can use the adjusted version below

#standardSQL
SELECT customer_id,
  IF(LOGICAL_OR((category, value) = ('A', 'A')), 'TRUE', '') AS a_a,
  IF(LOGICAL_OR((category, value) = ('A', 'B')), 'TRUE', '') AS a_b,
  IF(LOGICAL_OR((category, value) = ('A', 'C')), 'TRUE', '') AS a_c,
  IF(LOGICAL_OR((category, value) = ('B', 'A')), 'TRUE', '') AS b_a,
  IF(LOGICAL_OR((category, value) = ('B', 'B')), 'TRUE', '') AS b_b
FROM `project.dataset.table1`  
JOIN `project.dataset.table2`
USING (customer_id)
GROUP BY customer_id

with result

Row customer_id a_a     a_b     a_c     b_a     b_b  
1   1           TRUE    TRUE            TRUE         
2   2                                           TRUE     
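
The queries above use an inner join, so customer 3, who has no rows in table2, does not appear in the result, whereas the report in the question lists that customer with empty cells. One possible variation (not part of the original answer, and assuming the same sample data) is to switch to a LEFT JOIN so that customers without any name/value pairs are kept:

```sql
#standardSQL
-- Sketch only: sample data copied from the question; table names are placeholders.
WITH `project.dataset.table1` AS (
  SELECT 1 Customer_ID, 'Bob' Name UNION ALL
  SELECT 2, 'Jenny' UNION ALL
  SELECT 3, 'Janice'
), `project.dataset.table2` AS (
  SELECT 1 Customer_ID, 'A' Category, 'A' Value UNION ALL
  SELECT 1, 'A', 'B' UNION ALL
  SELECT 1, 'B', 'A' UNION ALL
  SELECT 2, 'B', 'B'
)
SELECT customer_id,
  -- For a customer with no table2 rows, category and value are NULL, so
  -- LOGICAL_OR yields NULL and IF falls through to the empty string.
  IF(LOGICAL_OR(category = 'A' AND value = 'A'), 'TRUE', '') AS a_a,
  IF(LOGICAL_OR(category = 'A' AND value = 'B'), 'TRUE', '') AS a_b,
  IF(LOGICAL_OR(category = 'A' AND value = 'C'), 'TRUE', '') AS a_c,
  IF(LOGICAL_OR(category = 'B' AND value = 'A'), 'TRUE', '') AS b_a,
  IF(LOGICAL_OR(category = 'B' AND value = 'B'), 'TRUE', '') AS b_b
FROM `project.dataset.table1`
LEFT JOIN `project.dataset.table2`
USING (customer_id)
GROUP BY customer_id
ORDER BY customer_id
```

With this change, customer 3 should come back as a row whose five flag columns are all empty. The plain `category = ... AND value = ...` form is used here instead of the tuple comparison only to make the NULL handling easy to follow.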

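Note: in the above examples you don't really need the join, because you are not using any fields from table1; it serves only as a filter (only customers present in table1 are shown)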
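
To illustrate that note, here is a minimal sketch that reads table2 alone. It assumes every customer_id in table2 also exists in table1; if that does not hold, the join (acting as a filter) would exclude rows that this version keeps.

```sql
#standardSQL
-- Sketch only: sample data copied from the question; the table name is a placeholder.
WITH `project.dataset.table2` AS (
  SELECT 1 Customer_ID, 'A' Category, 'A' Value UNION ALL
  SELECT 1, 'A', 'B' UNION ALL
  SELECT 1, 'B', 'A' UNION ALL
  SELECT 2, 'B', 'B'
)
SELECT customer_id,
  LOGICAL_OR((category, value) = ('A', 'A')) AS a_a,
  LOGICAL_OR((category, value) = ('A', 'B')) AS a_b,
  LOGICAL_OR((category, value) = ('A', 'C')) AS a_c,
  LOGICAL_OR((category, value) = ('B', 'A')) AS b_a,
  LOGICAL_OR((category, value) = ('B', 'B')) AS b_b
FROM `project.dataset.table2`
GROUP BY customer_id
```

Like the inner-join version, this returns rows only for customers that have at least one entry in table2.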