How can I derive a category from a field I select in a query?

Time: 2019-07-04 14:57:54

Tags: sql google-bigquery

I'm not sure whether this is possible, but I have a table like the following:

URL | amount | date | ...........

The URL column holds values such as:

https://www.example.com/category1/subcategory1/....... | 1243 | 01-01-1999
https://www.example.com/category1/subcategory2/....... | 4325 | 01-02-1999
https://www.example.com/category1/subcategory2/....... | 23 | 01-02-1999
https://www.example.com/category2/subcategory1/....... | 12543 | 01-01-1999
https://www.example.com/category2/subcategory2/....... | 124453 | 01-01-1999

How can I get a result that groups the rows by the category already present in each URL? The result I'm looking for is:

category1 | average(amount) | 01-01-1999
category1 | average(amount) | 01-02-1999
category2 | average(amount) | 01-01-1999

I'm using Google BigQuery and am looking for an example query that does this.

3 answers:

Answer 0: (score: 1)

The following is for BigQuery Standard SQL:

#standardSQL
SELECT 
  REGEXP_EXTRACT(url, CONCAT(r'', NET.REG_DOMAIN(url), '/([^/]*)/')) AS category, 
  AVG(amount) AS avg_amount, date
FROM `project.dataset.table`
GROUP BY category, date

Note that the above solution also covers cases like the following, where the URL has no scheme:

www.example.com/category2/subcategory2/......., 124453, '01-01-1999' 
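
To see how the extraction behaves on both kinds of value, here is a minimal sketch with two rows inlined. The CTE name `urls` and the shortened URL tails are assumptions made for illustration; the expression is the one from the query above without the empty `r''` prefix. Per the note above, both rows should come back with `category2`.

```sql
#standardSQL
-- Inlined sample rows (illustrative only; URL tails shortened).
WITH urls AS (
  SELECT 'https://www.example.com/category2/subcategory2/' AS url UNION ALL
  SELECT 'www.example.com/category2/subcategory2/'
)
SELECT
  url,
  -- Registered domain used to build the pattern.
  NET.REG_DOMAIN(url) AS reg_domain,
  -- Capture the first path segment that follows the registered domain.
  REGEXP_EXTRACT(url, CONCAT(NET.REG_DOMAIN(url), '/([^/]*)/')) AS category
FROM urls
```

Because the pattern is anchored on the domain rather than on a position in the string, it does not matter whether `https://` is present.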

Answer 1: (score: 0)

If we split the URL on '/', the category appears to be the fourth element. So:

select split(url, '/')[ordinal(4)] as category, date, avg(amount)
from t
group by category, date;
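
As a sketch, the same query can be tried against the sample rows from the question by inlining them in a CTE. The URL tails are shortened, the dates are kept as strings exactly as written in the question, and `SAFE_ORDINAL` is swapped in for `ORDINAL` so that a URL with fewer than four parts yields NULL instead of an error; all of these are assumptions for illustration.

```sql
#standardSQL
-- Sample rows from the question, inlined (URL tails shortened).
WITH t AS (
  SELECT 'https://www.example.com/category1/subcategory1/' AS url, 1243 AS amount, '01-01-1999' AS date UNION ALL
  SELECT 'https://www.example.com/category1/subcategory2/', 4325, '01-02-1999' UNION ALL
  SELECT 'https://www.example.com/category1/subcategory2/', 23, '01-02-1999' UNION ALL
  SELECT 'https://www.example.com/category2/subcategory1/', 12543, '01-01-1999' UNION ALL
  SELECT 'https://www.example.com/category2/subcategory2/', 124453, '01-01-1999'
)
SELECT
  -- Splitting on '/' gives ['https:', '', 'www.example.com', 'category1', ...].
  SPLIT(url, '/')[SAFE_ORDINAL(4)] AS category,
  date,
  AVG(amount) AS avg_amount
FROM t
GROUP BY category, date
ORDER BY category, date
-- category1 | 01-01-1999 | 1243.0
-- category1 | 01-02-1999 | 2174.0
-- category2 | 01-01-1999 | 68498.0
```

This positional approach assumes every URL starts with a scheme such as `https://`. For a value like `www.example.com/category2/...`, the category would be the second element instead.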

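Answer 2: (score: 0)

  1. Create a temporary table (a table variable, written here in SQL Server syntax rather than BigQuery).
declare @t table (category varchar(100), amount bigint, date date) -- length 100 is an arbitrary choice
  2. Insert the processed data from the source table.
 insert into @t
 select dbo.f_findSring(column1) -- a function you write yourself; it returns category1, category2 and so on
       ,amount
       ,date
 from source_table -- placeholder name for your source table
  3. Query the temporary table.
 select category
        ,avg(amount)
        ,date
from @t
group by category, date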