I am looking for a mechanism to check the accuracy of the data imported every day into several BigQuery tables. Each table has a similar format, with a DATE and an ID column. The tables look like this:
Table_1
| DATE       | ID |
|------------|----|
| 2018-10-01 | A  |
| 2018-10-01 | B  |
| 2018-10-02 | A  |
| 2018-10-02 | B  |
| 2018-10-02 | C  |
What I want to monitor is the evolution of the number of IDs, through an output table like this:
CONTROL_TABLE
| DATE       | COUNT(Table1.ID) | COUNT(Table2.ID) | COUNT(Table3.ID) |
|------------|------------------|------------------|------------------|
| 2018-10-01 | 2                | 487654           | 675386           |
| 2018-10-02 | 3                | 488756           | 675447           |
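For just a few tables, a query of roughly this shape would give me that output (a minimal sketch; `my_project.my_dataset` and the `count_table*` aliases are placeholders), but it does not scale to my real setup:

```sql
-- Sketch: count IDs per DATE in each table, then line the counts up by DATE.
SELECT
  DATE,
  t1.cnt AS count_table1,
  t2.cnt AS count_table2,
  t3.cnt AS count_table3
FROM (SELECT DATE, COUNT(ID) AS cnt FROM `my_project.my_dataset.Table_1` GROUP BY DATE) AS t1
JOIN (SELECT DATE, COUNT(ID) AS cnt FROM `my_project.my_dataset.Table_2` GROUP BY DATE) AS t2 USING (DATE)
JOIN (SELECT DATE, COUNT(ID) AS cnt FROM `my_project.my_dataset.Table_3` GROUP BY DATE) AS t3 USING (DATE)
ORDER BY DATE;
```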
I am trying to do this with one single SQL query, but I am running into some DML limitations, such as:
-> One single SELECT joining all the tables is out of the question for performance reasons (20+ tables with millions of rows)
-> I was thinking of going through temporary tables, but it seems I cannot run multiple DELETE + INSERT statements on several tables with DML (see the sketch after this list)
-> I cannot use a wildcard table as the output of the query
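For reference, the multi-statement flow I was considering looks roughly like this (a sketch only; the project, dataset and `count_table1` names are placeholders, and it would have to run as a multi-statement script). Getting this to run for all 20+ tables as one query is exactly what I have not managed to do:

```sql
-- Sketch of the temporary-table idea for a single source table.
CREATE TEMP TABLE tmp_counts AS
SELECT DATE, COUNT(ID) AS cnt
FROM `my_project.my_dataset.Table_1`
GROUP BY DATE;

-- Remove any counts already stored for those dates...
DELETE FROM `my_project.my_dataset.CONTROL_TABLE`
WHERE DATE IN (SELECT DATE FROM tmp_counts);

-- ...then insert the fresh counts for Table_1; the same DELETE + INSERT
-- would have to be repeated for each of the 20+ tables.
INSERT INTO `my_project.my_dataset.CONTROL_TABLE` (DATE, count_table1)
SELECT DATE, cnt FROM tmp_counts;
```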
Would anyone know how best to obtain this kind of result (ideally with one single query)?