BigQuery DML COUNT() across multiple tables

Time: 2018-10-03 13:35:01

Tags: google-bigquery dml

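I am looking for a mechanism to check the accuracy of the data imported every day into multiple BigQuery tables. Each table has a similar format, with DATE and ID columns. The table format is as follows: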

Table_1
| DATE       | ID       |
| 2018-10-01 | A        |
| 2018-10-01 | B        |
| 2018-10-02 | A        |
| 2018-10-02 | B        |
| 2018-10-02 | C        |

What I want to monitor is how the number of IDs evolves over time, using an output table like this:

CONTROL_TABLE
| DATE       | COUNT(Table1.ID) | COUNT(Table2.ID) | COUNT(Table3.ID) |
| 2018-10-01 |                2 |           487654 |           675386 |
| 2018-10-02 |                3 |           488756 |           675447 |

I am trying to do this with one single SQL query, but I am running into some limitations with DML, for example:

-> A single SELECT with all the tables joined is out of the question for performance reasons (20+ tables with millions of rows)
-> I was thinking of going through temporary tables, but it seems I cannot run multiple DELETE + INSERT statements on several tables with DML
-> I cannot use a wildcard table as the output of the query

Would anyone know the best way (ideally with one single query) to get this kind of result?
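
To make the target concrete, here is a minimal sketch of one possible shape for such a query, assuming every table really does expose DATE and ID columns. Each table is counted on its own with GROUP BY DATE, the per-table results are stacked with UNION ALL, and a conditional SUM pivots them into one column per table, so no join between the large tables is involved. The helper name build_control_query, the (label, table_ref) input pairs, and the count_<label> column names are hypothetical, and the generated SQL has not been run against BigQuery here.

```python
def build_control_query(tables):
    """Build one SELECT returning a row per DATE and a COUNT(ID) column per table.

    `tables` is a list of (label, table_ref) pairs. Both are hypothetical
    inputs for illustration, e.g. ("Table_1", "`my_dataset.Table_1`").
    """
    # Aggregate each table independently, so the tables are never joined.
    per_table = [
        f"  SELECT DATE, '{label}' AS src, COUNT(ID) AS n FROM {ref} GROUP BY DATE"
        for label, ref in tables
    ]
    # Pivot the stacked (DATE, src, n) rows into one count column per table.
    columns = [
        f"  SUM(CASE WHEN src = '{label}' THEN n ELSE 0 END) AS count_{label}"
        for label, _ in tables
    ]
    return (
        "SELECT\n  DATE,\n"
        + ",\n".join(columns)
        + "\nFROM (\n"
        + "\n  UNION ALL\n".join(per_table)
        + "\n)\nGROUP BY DATE\nORDER BY DATE"
    )


if __name__ == "__main__":
    # Dataset and table names below are placeholders, not real resources.
    print(build_control_query([
        ("Table_1", "`my_dataset.Table_1`"),
        ("Table_2", "`my_dataset.Table_2`"),
    ]))
```

Since the result is a plain SELECT, it could presumably be used to fill CONTROL_TABLE through an INSERT ... SELECT statement or a query destination table; whether that stays within the DML limits mentioned above is not verified here.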

0 answers:

No answers yet