Change a query to avoid "aggregations are not allowed" in BigQuery

Time: 2019-03-25 14:40:20

Tags: sql google-bigquery

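Given a users table and an orders table, I need to count the users who placed their first order on the day after their registration date.

I managed to list such users with the following query: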

SELECT 
  users.first_name as first_name,
  users.last_name as last_name,
  users.registration_date as registration_date,
  min(orders.order_date) as first_order_date
FROM `users_table` as users
  JOIN `orders_table` as orders
  ON users.id = orders.user_id
GROUP BY
  first_name,
  last_name,
  registration_date
HAVING
  date_diff(first_order_date, registration_date, DAY) = 1
ORDER BY
  registration_date ASC
LIMIT 5

Result:

+------------+-----------+-------------------+------------------+
| first_name | last_name | registration_date | first_order_date |
+------------+-----------+-------------------+------------------+
| Albert     | Ellis     | 2013-04-11        | 2013-04-12       |
| Charles    | Moore     | 2014-04-29        | 2014-04-30       |
| Jimmy      | Payne     | 2014-07-01        | 2014-07-02       |
| Angela     | Stanley   | 2014-10-21        | 2014-10-22       |
| Marie      | Bishop    | 2014-11-15        | 2014-11-16       |
+------------+-----------+-------------------+------------------+

Now I can't wrap my head around counting them. When I try something like:

SELECT 
  count(date_diff(min(orders.order_date), users.registration_date, DAY) = 1)
FROM `users_table` as users
  JOIN `orders_table` as orders
  ON users.id = orders.user_id

I get an "aggregations are not allowed" error. How can I modify the query to fix this?

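3 answers:

Answer 0: (score: 2)

Just put your query into a subquery. You are already selecting the customers who ordered on the day after registering, so the answer is simply the number of rows that query returns. Note that ORDER BY and LIMIT are dropped, since LIMIT 5 would cap the count at 5.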

SELECT COUNT(1)
FROM (
  SELECT
    users.first_name as first_name,
    users.last_name as last_name,
    users.registration_date as registration_date,
    min(orders.order_date) as first_order_date
  FROM `users_table` as users
    JOIN `orders_table` as orders
    ON users.id = orders.user_id
  GROUP BY
    first_name,
    last_name,
    registration_date
  HAVING
    date_diff(first_order_date, registration_date, DAY) = 1
) x
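
One thing to watch with grouping by first_name, last_name and registration_date: two different users who happen to share a name and a registration date would collapse into a single group and be counted once. A minimal sketch that groups by the user id instead, assuming users.id is unique per user (the join in the question suggests this, but it is not stated) and that both date columns are of type DATE; the alias next_day_order_users is only illustrative:

```sql
SELECT COUNT(*) AS next_day_order_users
FROM (
  SELECT
    users.id
  FROM `users_table` AS users
    JOIN `orders_table` AS orders
    ON users.id = orders.user_id
  -- One group per user; registration_date is listed so HAVING can use it
  GROUP BY
    users.id,
    users.registration_date
  -- Keep only users whose earliest order is exactly one day after registration
  HAVING
    DATE_DIFF(MIN(orders.order_date), users.registration_date, DAY) = 1
) AS x
```

The outer query only counts rows, so the inner SELECT no longer needs the name columns.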

Answer 1: (score: 1)

The following is for BigQuery Standard SQL:

#standardSQL
SELECT COUNT(1) next_day_order_users
FROM `project.dataset.users_table` AS users
JOIN (
  SELECT user_id, MIN(order_date) first_order_date 
  FROM `project.dataset.orders_table`
  GROUP BY user_id
) AS orders
ON users.id = orders.user_id
WHERE DATE_DIFF(first_order_date, registration_date, DAY) = 1
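
For context on why the attempt in the question fails: it nests MIN inside COUNT, and an aggregate cannot be applied directly to another aggregate at the same query level. Separately, COUNT(expression) counts every non-NULL value, so even without the nesting it would count FALSE comparisons as well. If you want the condition inside the aggregate rather than in WHERE, a sketch using BigQuery's COUNTIF could look like this (same assumed table names as above; users_with_orders is just an illustrative extra column):

```sql
#standardSQL
SELECT
  -- Counts only the rows where the condition is TRUE
  COUNTIF(DATE_DIFF(first_order_date, registration_date, DAY) = 1) AS next_day_order_users,
  -- All users that have at least one order, for comparison
  COUNT(*) AS users_with_orders
FROM `project.dataset.users_table` AS users
JOIN (
  -- MIN is computed one level down, so the outer aggregate sees a plain column
  SELECT user_id, MIN(order_date) AS first_order_date
  FROM `project.dataset.orders_table`
  GROUP BY user_id
) AS orders
ON users.id = orders.user_id
```

This form is handy when you want the matching count and the total in the same row, for example to compute a share.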

Answer 2: (score: 0)

Why not just use the JOIN condition?

SELECT COUNT(DISTINCT u.id)
FROM `users_table` u JOIN
     `orders_table` o        
     ON u.id = o.user_id AND
        date_diff(o.order_date, u.registration_date, DAY) = 1;

COUNT(DISTINCT ...) accounts for the fact that a user may have several orders on the same day.
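
Be aware that, as written, this counts users who have any order on the day after registration, which is not quite the same as users whose first order falls on that day: someone who ordered on the registration day and again the next day would be included. If the first-order requirement matters, one possible sketch adds a NOT EXISTS check for earlier orders (the alias earlier is made up for illustration, and whether BigQuery accepts this correlated subquery as is has not been verified here):

```sql
SELECT COUNT(DISTINCT u.id) AS next_day_order_users
FROM `users_table` u
JOIN `orders_table` o
  ON u.id = o.user_id
  AND DATE_DIFF(o.order_date, u.registration_date, DAY) = 1
-- Exclude users who already had an order before the matched one
WHERE NOT EXISTS (
  SELECT 1
  FROM `orders_table` earlier
  WHERE earlier.user_id = u.id
    AND earlier.order_date < o.order_date
)
```

With that extra filter the result should agree with the MIN-based queries in the other answers.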