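I have data like this. It is a sample Teradata log in which CPU and IO are captured at the QueryID level. I have parsed the query text for each QueryID to identify the databases and tables it references. After parsing each query and breaking it down into that detail, I cannot apportion the captured CPU and IO at the detail level; they are header-level attributes of the query.
I am now displaying the data in Data Studio. When I bring the DatabaseReferred or TablesReferred field onto the dashboard to get a distinct count of the tables referenced in a query, CPU and IO are duplicated because the data is UNNESTed internally, and the totals blow up when I aggregate them.
Can you give me an idea of how to sum CPU only once per query while still counting the distinct DatabaseReferred and TablesReferred values for that query?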
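A minimal sketch of the duplication, using only the first sample row. It assumes that the internal flattening behaves like a cross join with UNNEST; that is an assumption for illustration, since the query Data Studio actually issues is not shown here. The CTE name one_query is hypothetical.

```sql
#standardSQL
-- one_query is a hypothetical CTE holding a single log row (QueryID 1234).
WITH one_query AS (
  SELECT 'ABC' AS username, 1234 AS QueryID, 100 AS CPU,
         ['TB1', 'TB2', 'TB3'] AS TablesReferred
)
SELECT
  username,
  SUM(CPU) AS Sum_CPU,                       -- 300, not 100: CPU repeats once per array element
  COUNT(DISTINCT tbl) AS DistinctTablesReferred  -- 3
FROM one_query, UNNEST(TablesReferred) AS tbl    -- comma join = CROSS JOIN with the array
GROUP BY username
```

The single row with CPU = 100 becomes three rows after the UNNEST, so the sum is tripled while the distinct table count is still correct.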
The input data looks like this:
Row  Username  QueryId  CPU  IO   DatabaseReferred  TablesReferred
1    ABC       1234     100  123  DB1               TB1
                                  DB2               TB2
                                  DB1               TB3
2    ABC       8454     589  565  DB1               TB3
                                  DB2               TB6
3    ABC       3564     145  243  DB3               TB4
                                  DB5               TB3
4    PQR       6352     737  562  DB2               TB6
                                  DB1               TB7
                                  DB1               TB2
5    PQR       2345     200  126  DB2               TB5
                                  DB1               TB1
The expected output is:
Username  Count(DistinctQueryID)  Sum(CPU)  Sum(IO)  DistinctDatabaseReferred  DistinctTablesReferred
ABC       3                       834       931      4                         5
PQR       2                       937       688      2                         5
For quick reference, I have prepared the input data below so that it can be used in a WITH clause in the solution:
SELECT 'ABC' username, cast('1234' as int64) QueryID, cast('100' as int64) CPU, cast('123' as int64) IO, ['DB1','DB2','DB1'] DatabaseReferred, ['TB1','TB2','TB3'] TablesReferred
UNION ALL
SELECT 'ABC' username, cast('8454' as int64) QueryID, cast('589' as int64) CPU, cast('565' as int64) IO, ['DB1','DB2'] DatabaseReferred, ['TB3','TB6'] TablesReferred
UNION ALL
SELECT 'ABC' username, cast('3564' as int64) QueryID, cast('145' as int64) CPU, cast('243' as int64) IO, ['DB3','DB5'] DatabaseReferred, ['TB4','TB3'] TablesReferred
UNION ALL
SELECT 'PQR' username, cast('6352' as int64) QueryID, cast('737' as int64) CPU, cast('562' as int64) IO, ['DB2','DB1','DB1'] DatabaseReferred, ['TB6','TB7','TB2'] TablesReferred
UNION ALL
SELECT 'PQR' username, cast('2345' as int64) QueryID, cast('200' as int64) CPU, cast('126' as int64) IO, ['DB2','DB1'] DatabaseReferred, ['TB5','TB1'] TablesReferred
Answer 0 (score: 1)
Below is for BigQuery Standard SQL:
#standardSQL
SELECT
  Username,
  Count_of_Distinct_QueryId,
  Sum_CPU,
  Sum_IO,
  -- Distinct counts are taken over the per-user concatenated arrays.
  (SELECT COUNT(DISTINCT db) FROM t.dbs AS db) AS DistinctDatabaseReferred,
  (SELECT COUNT(DISTINCT tbl) FROM t.tbls AS tbl) AS DistinctTablesReferred
FROM (
  -- Aggregate per user without unnesting, so CPU and IO count once per row.
  SELECT
    Username,
    COUNT(DISTINCT QueryId) AS Count_of_Distinct_QueryId,
    SUM(CPU) AS Sum_CPU,
    SUM(IO) AS Sum_IO,
    ARRAY_CONCAT_AGG(DatabaseReferred) AS dbs,
    ARRAY_CONCAT_AGG(TablesReferred) AS tbls
  FROM `project.dataset.table`
  GROUP BY Username
) t
If applied to the sample data from your question, the output is:
Row  Username  Count_of_Distinct_QueryId  Sum_CPU  Sum_IO  DistinctDatabaseReferred  DistinctTablesReferred
1    ABC       3                          834      931     4                         5
2    PQR       2                          937      688     2                         5
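To try this without a real table, the sample rows from the question can be wrapped in a WITH clause and fed to the same query. The following is a sketch under a few assumptions: the CTE name sample_data is hypothetical and stands in for `project.dataset.table`, the literals are written directly as integers instead of being cast from strings, and the arrays are unnested with an explicit UNNEST(...) for readability. It also assumes the log holds exactly one row per QueryId; if a QueryId appeared on several rows, SUM(CPU) would count it once per row.

```sql
#standardSQL
-- sample_data is a hypothetical stand-in for `project.dataset.table`.
WITH sample_data AS (
  SELECT 'ABC' AS Username, 1234 AS QueryId, 100 AS CPU, 123 AS IO,
         ['DB1', 'DB2', 'DB1'] AS DatabaseReferred,
         ['TB1', 'TB2', 'TB3'] AS TablesReferred
  UNION ALL
  SELECT 'ABC', 8454, 589, 565, ['DB1', 'DB2'], ['TB3', 'TB6']
  UNION ALL
  SELECT 'ABC', 3564, 145, 243, ['DB3', 'DB5'], ['TB4', 'TB3']
  UNION ALL
  SELECT 'PQR', 6352, 737, 562, ['DB2', 'DB1', 'DB1'], ['TB6', 'TB7', 'TB2']
  UNION ALL
  SELECT 'PQR', 2345, 200, 126, ['DB2', 'DB1'], ['TB5', 'TB1']
)
SELECT
  Username,
  Count_of_Distinct_QueryId,
  Sum_CPU,
  Sum_IO,
  -- Unnest only inside the scalar subqueries, after CPU and IO are summed.
  (SELECT COUNT(DISTINCT db) FROM UNNEST(t.dbs) AS db) AS DistinctDatabaseReferred,
  (SELECT COUNT(DISTINCT tbl) FROM UNNEST(t.tbls) AS tbl) AS DistinctTablesReferred
FROM (
  SELECT
    Username,
    COUNT(DISTINCT QueryId) AS Count_of_Distinct_QueryId,
    SUM(CPU) AS Sum_CPU,
    SUM(IO) AS Sum_IO,
    ARRAY_CONCAT_AGG(DatabaseReferred) AS dbs,
    ARRAY_CONCAT_AGG(TablesReferred) AS tbls
  FROM sample_data
  GROUP BY Username
) AS t
ORDER BY Username
```

The key design choice is the order of operations: CPU and IO are summed while each query is still a single row, and the arrays are only flattened afterwards inside the COUNT(DISTINCT ...) subqueries, so the flattening cannot multiply the header-level values.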