Cross-multiplying two tables with the same schema using BigQuery

Time: 2017-09-25 12:29:06

Tags: sql google-bigquery

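There are two tables (TableA and TableB) with the same schema.

I want to multiply TableA.ColumnB * TableB.ColumnB, and so on for every column. I can do this by joining the two tables and then multiplying each pair of columns, as shown below: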

SELECT TableA.ColumnA,
       (TableA.ColumnB * TableB.ColumnB) AS ColumnB,
       (TableA.ColumnC * TableB.ColumnC) AS ColumnC
FROM TableA
  JOIN TableB ON TableA.ColumnA = TableB.ColumnA

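Since there are many columns to multiply, I am looking for a simpler way to do this in BigQuery. Something like TableA * TableB (so that the same columns in both tables get multiplied).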

Thanks

1 Answer:

Answer 0 (score: 1)

  

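"Is there a simple way, something like TableA * TableB?" Not that I know of in BigQuery.

My suggestion is to build a utility query that creates the proper query for you, with all 100 columns. Of course, this can easily be done with any tool, but if you want to stay within BigQuery, below is an option for BigQuery Standard SQL.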

   

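The utility query below builds the part of the final query where you would otherwise have to list every expression of the form

(TableA.ColumnX * TableB.ColumnX) AS ColumnX,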

#standardSQL
SELECT 
  CONCAT(
    'SELECT a.ColumnA AS ColumnA, \n',
    STRING_AGG(CONCAT(
      '\ta.', SPLIT(kv_a, ':')[SAFE_OFFSET(0)], ' * ', 
      'b.', SPLIT(kv_b, ':')[SAFE_OFFSET(0)], 
      ' AS ', SPLIT(kv_a, ':')[SAFE_OFFSET(0)]), 
      ' , \n' ORDER BY pos_a), ' \n',  -- keep columns in table order
    'FROM TableA a JOIN TableB b ON a.ColumnA = b.ColumnA'
  ) AS query_string  
FROM (
  SELECT 
    1 AS grp,
    SPLIT(REGEXP_REPLACE(TO_JSON_STRING(a), '["{}]', '')) AS kvs_a,
    SPLIT(REGEXP_REPLACE(TO_JSON_STRING(b), '["{}]', '')) AS kvs_b
  FROM (SELECT * FROM TableA LIMIT 1) a
  JOIN TableB b
  ON a.ColumnA = b.ColumnA
  LIMIT 1
) 
CROSS JOIN UNNEST(kvs_a) kv_a WITH OFFSET pos_a
CROSS JOIN UNNEST(kvs_b) kv_b WITH OFFSET pos_b
WHERE pos_a = pos_b AND pos_a > 0
GROUP BY grp   

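If your environment "respects" \n and \t, you will get the following result (assuming the three columns from the tables in your question, but it works exactly the same way for 100 columns):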

query_string
------------
SELECT a.ColumnA AS ColumnA,   
  a.ColumnB * b.ColumnB AS ColumnB ,   
  a.ColumnC * b.ColumnC AS ColumnC   
FROM TableA a JOIN TableB b ON a.ColumnA = b.ColumnA    

So now you can copy the result of the utility query and run it as your final query.
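
To see why the inner SELECT of the utility query yields column names, its string steps can be mimicked outside BigQuery. The following is only a Python emulation under stated assumptions: the sample row and its values are made up, `column_names_from_row` is a hypothetical helper, and `json.dumps` with compact separators stands in for `TO_JSON_STRING`.

```python
import json
import re


def column_names_from_row(row):
    """Mimic the utility query's string steps on one row (given as a dict)."""
    # Stand-in for TO_JSON_STRING: compact JSON such as {"ColumnA":1,"ColumnB":2.5}
    as_json = json.dumps(row, separators=(",", ":"))
    # Same character class as the REGEXP_REPLACE call: drop quotes and braces
    stripped = re.sub(r'["{}]', "", as_json)
    # Splitting on commas gives "name:value" pairs; keep the part before the first colon
    return [kv.split(":")[0] for kv in stripped.split(",")]


sample_row = {"ColumnA": 1, "ColumnB": 2.5, "ColumnC": 4}  # made-up values
names = column_names_from_row(sample_row)
# Position 0 is the join key, which the utility query skips with pos_a > 0
value_columns = names[1:]
print(value_columns)
```

The emulation also shows a limitation of splitting on commas: if a value in the sampled row contains a comma, the fragment after it is picked up as if it were a column name, so it is worth checking the generated query before running it.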

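As you may have noticed, this approach is based on column positions. If the column names are the same in both tables, you can simplify the utility query by removing the join, as shown below: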

#standardSQL
SELECT 
  CONCAT(
    'SELECT a.ColumnA AS ColumnA, \n',
    STRING_AGG(CONCAT(
      '\ta.', SPLIT(kv_a, ':')[SAFE_OFFSET(0)], ' * ', 
      'b.', SPLIT(kv_a, ':')[SAFE_OFFSET(0)], 
      ' AS ', SPLIT(kv_a, ':')[SAFE_OFFSET(0)]), 
      ' , \n' ORDER BY pos_a), ' \n',  -- keep columns in table order
    'FROM TableA a JOIN TableB b ON a.ColumnA = b.ColumnA'
  ) AS query_string  
FROM (
  SELECT 
    1 AS grp,
    SPLIT(REGEXP_REPLACE(TO_JSON_STRING(a), '["{}]', '')) AS kvs_a
  FROM (SELECT * FROM TableA LIMIT 1) a
) 
CROSS JOIN UNNEST(kvs_a) kv_a WITH OFFSET pos_a
WHERE pos_a > 0
GROUP BY grp
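
As mentioned at the start of this answer, the same query text can be produced with any tool. Here is a minimal sketch in Python, assuming the non-key column names are already available as plain lists (how they are fetched is left out). `build_multiply_query` and its parameters are hypothetical names used only for illustration. Passing a second list pairs the columns by position; otherwise the same name is used on both sides.

```python
def build_multiply_query(columns_a, columns_b=None, key="ColumnA",
                         table_a="TableA", table_b="TableB"):
    """Return SQL text that multiplies matching columns of two tables.

    columns_a / columns_b hold the non-key column names. When columns_b
    is omitted, both tables are assumed to use the same names.
    """
    if columns_b is None:
        columns_b = columns_a
    if len(columns_a) != len(columns_b):
        raise ValueError("both tables must have the same number of columns")
    # One "a.X * b.Y AS X" expression per column pair, paired by position
    products = [
        f"  a.{col_a} * b.{col_b} AS {col_a}"
        for col_a, col_b in zip(columns_a, columns_b)
    ]
    select_list = ",\n".join([f"  a.{key} AS {key}"] + products)
    return (
        f"SELECT\n{select_list}\n"
        f"FROM {table_a} a JOIN {table_b} b ON a.{key} = b.{key}"
    )


print(build_multiply_query(["ColumnB", "ColumnC"]))
```

Column names are interpolated as-is, so this sketch only suits trusted names that need no quoting. Names that require backticks are not handled.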