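Adding an extra row to a SQL query without specifying all columns

Asked: 2018-07-26 19:59:12

Tags: sql google-bigquery

I have a "customers" table with about 30 columns. It comes from a third-party data source, so I have no control over the number of columns and cannot insert rows into the table.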

Select * from customers 

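What I want the query to return is the complete list of customers plus one additional row, which I can use to link transactions to an "unknown" customer.

On the extra row I only want to define a few columns and leave the rest empty.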

SELECT -1 customer_id, 'No Customer' customer_name FROM DUAL

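So essentially I need every record from the customers table, plus one extra "dummy" record in which all columns are NULL except customer_id and customer_name.

This can be done with a UNION, but that requires explicitly defining every unused column as NULL. That is a problem because the third party may add more columns in the future, which would break the query.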
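For illustration, here is a sketch of what the UNION approach would look like if the table had only the four columns shown in the example tables below. The column types are an assumption (customer_id as INT64, the rest as STRING), and the real table has about 30 columns, so the list of NULLs would be much longer:

```sql
#standardSQL
-- Sketch only: assumes customers has exactly these four columns and that
-- the table name resolves in the current project and dataset.
SELECT customer_id, customer_name, customer_city, customer_industry
FROM customers
UNION ALL
-- Every column that is not needed has to be spelled out as a typed NULL.
-- If the third party adds a fifth column and the first branch is changed
-- to SELECT *, the column counts no longer match and the query fails.
SELECT -1, 'No Customer', CAST(NULL AS STRING), CAST(NULL AS STRING)
```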

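Is it possible to append an extra row to a query (a union of rows) while defining only the columns I care about, without explicitly defining ALL the other columns as NULL?

Example: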

<h1>
Existing Customer table:
</h1>

<table class="tableizer-table">
<thead><tr class="tableizer-firstrow"><th>Customer_id</th><th>customer_name</th><th>customer_city</th><th>customer_industry</th></tr></thead><tbody>
 <tr><td>5453</td><td>Apple Inc.</td><td>Cupertino</td><td>Technology</td></tr>
 <tr><td>7865</td><td>Union Pacific</td><td>Omaha</td><td>Shipping</td></tr>
</tbody></table>

<h1>With extra data</h1>
<table class="tableizer-table">
<thead><tr class="tableizer-firstrow"><th>Customer_id</th><th>customer_name</th><th>customer_city</th><th>customer_industry</th></tr></thead><tbody>
 <tr><td>5453</td><td>Apple Inc.</td><td>Cupertino</td><td>Technology</td></tr>
 <tr><td>7865</td><td>Union Pacific</td><td>Omaha</td><td>Shipping</td></tr>
 <tr><td>-1</td><td>Unknown Customer</td><td>[NULL]</td><td>[NULL]</td></tr>
</tbody></table>

2 answers:

Answer 0 (score: 4)

Assuming customer_id and customer_name are the first two columns, you can do the following:

select x.customer_id, x.customer_name, c.* except (customer_id, customer_name)
from (select -1 as customer_id, 'No Customer' as customer_name) x left join
     customer c
     on 1 = 0;

The except modifier is very handy in BigQuery.
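A rough sketch of how this could be combined with the full table, using inline sample data copied from the example tables in the question. The join condition 1 = 0 never matches, so the LEFT JOIN keeps the single row from x and fills every column of c with NULL. The CTE named customer is a stand-in for the real table, and the columns only line up because customer_id and customer_name come first in it:

```sql
#standardSQL
-- Stand-in for the real table; values are taken from the question's example.
WITH customer AS (
  SELECT 5453 AS customer_id, 'Apple Inc.' AS customer_name,
         'Cupertino' AS customer_city, 'Technology' AS customer_industry
  UNION ALL
  SELECT 7865, 'Union Pacific', 'Omaha', 'Shipping'
)
SELECT * FROM customer
UNION ALL
-- x supplies the two defined values; c contributes only NULLs, one for
-- each remaining column, whatever columns the table happens to have.
SELECT x.customer_id, x.customer_name, c.* EXCEPT (customer_id, customer_name)
FROM (SELECT -1 AS customer_id, 'No Customer' AS customer_name) x
LEFT JOIN customer c
ON 1 = 0
```

The extra row should come back with -1 and 'No Customer' in the first two columns and NULL in customer_city and customer_industry.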

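Answer 1 (score: 4)

Gordon's solution is good [as always :o)], but it still has a drawback: it depends heavily on the order of the columns in the customer table. If customer_id and customer_name are not the first two columns of the customer table, the solution fails as soon as you UNION it with the original query.

For example, the query below will either fail or, if the types happen to match, put values in the wrong columns: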

#standardSQL
WITH `project.dataset.customer` AS (
  SELECT 'many other columns here' other, 1 customer_id, 'abc' customer_name 
)
SELECT x.customer_id, x.customer_name, c.* EXCEPT (customer_id, customer_name)
FROM (SELECT -1 AS customer_id, 'No Customer' AS customer_name) x 
LEFT JOIN `project.dataset.customer` c
ON 1 = 0 
UNION ALL SELECT * FROM `project.dataset.customer`   

To fix this, use another very handy BigQuery feature:

SELECT * REPLACE()    
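As a minimal illustration of what REPLACE does on its own: it swaps the expression for a named column while leaving that column at its original position in the output. The CTE t and its values are made up for this sketch:

```sql
#standardSQL
-- Hypothetical one-row table, with customer_name deliberately not first.
WITH t AS (
  SELECT 'Omaha' AS customer_city, 1 AS customer_id, 'abc' AS customer_name
)
-- customer_name is replaced in place; column order and names are unchanged.
SELECT * REPLACE ('No Customer' AS customer_name)
FROM t
```

The result keeps the order customer_city, customer_id, customer_name, with only the value of customer_name changed.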

So below is an example of a solution that addresses the problem above (BigQuery Standard SQL):

#standardSQL
WITH `project.dataset.customer` AS (
  SELECT 'many other columns here' other, 1 customer_id, 'abc' customer_name 
)
SELECT c.* REPLACE(-1 AS customer_id, 'No Customer' AS customer_name)
FROM (SELECT 1) x 
LEFT JOIN `project.dataset.customer` c
ON 1 = 0 
UNION ALL
SELECT * FROM `project.dataset.customer`  

As you can see, the query above does not depend on column order and always returns the result in the original schema of the customer table.