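I have an Apache combined log file loaded into BigQuery. Its schema consists of resource, place_id, ip, start_time, end_time, device, and status. I am trying to run a query that counts resources and devices, grouped by resource and device.
Table: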
resource | place_id | device | ip | status |
-----------------------------------------------------------------
/resource1 | 6750320008 | android | x.x.x.x | 200 |
/resource1 | 6750320100 | ipad | x.x.x.y | 200 |
/resource2 | 6750320008 | android | x.x.x.z | 200 |
Query:
SELECT resource, device
FROM (
Select
EXACT_COUNT_DISTINCT(resource) AS URL,
1 AS scalar,
FROM ([daily_logs.app_logs_data])
WHERE place_id = '6750320008' GROUP BY URL) AS datal
JOIN (
SELECT
COUNT(device) as DeviceCount,
1 AS scalar
FROM ([daily_logs.app_logs_data]) GROUP BY DeviceCount) AS y
ON datal.scalar=y.scalar
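I get this error: Error: Cannot group by an aggregate.
I am basically building two derived tables from the same table, each counting a different item, and then I want to join them together, grouped so the result looks like this: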
URL | totalresourcecount | device | totaldevicecount
-----------------------------------------------------------------
/resource1 | 1 | android | 1
/resource1 | 1 | ipad | 1
/resource2 | 1 | android | 1
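I have read the Google BigQuery syntax help and looked at some examples, but nothing produced the desired result. Thanks in advance!
Answer 0 (score: 1)
Below is for BigQuery Standard SQL and reflects the logic described in the follow-up comments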
#standardSQL
SELECT resource, device, COUNT(1) cnt
FROM `project.dataset.yourtable`
WHERE place_id = '6750320008'
GROUP BY resource, device
You can test / play with the above using the dummy data below
#standardSQL
WITH `project.dataset.yourtable` AS (
SELECT '/resource1' resource, '6750320008' place_id, 'android' device, 'x.x.x.x' ip, 200 status UNION ALL
SELECT '/resource1', '6750320100', 'ipad', 'x.x.x.y', 200 UNION ALL
SELECT '/resource2', '6750320008', 'android', 'x.x.x.z', 200
)
SELECT resource, device, COUNT(1) cnt
FROM `project.dataset.yourtable`
WHERE place_id = '6750320008'
GROUP BY resource, device
Note: the above depends on how I understood the query logic you described in your follow-up comments
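As for the error itself: in the original query, `GROUP BY URL` points at the alias of `EXACT_COUNT_DISTINCT(resource)`, and `GROUP BY DeviceCount` points at the alias of `COUNT(device)`, so each subquery is asked to group by an aggregate. The usual pattern is to group by the raw column and keep the aggregate in the SELECT list only. Below is a minimal sketch of that pattern for the per-resource count; the `logs` CTE is a hypothetical inline copy of the sample rows, not your real table, and the column name `resource_count` is just an illustration.

```sql
#standardSQL
-- Hypothetical inline sample data standing in for the real log table.
WITH logs AS (
  SELECT '/resource1' AS resource, '6750320008' AS place_id, 'android' AS device UNION ALL
  SELECT '/resource1', '6750320100', 'ipad' UNION ALL
  SELECT '/resource2', '6750320008', 'android'
)
-- Group by the column itself; the aggregate appears only in the SELECT list.
SELECT resource, COUNT(*) AS resource_count
FROM logs
WHERE place_id = '6750320008'
GROUP BY resource
```

The same shape should work for a per-device count by swapping `resource` for `device`; whether you then need a join at all depends on what the final totals are meant to represent.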