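I have the table below, and I am trying to determine how many users spent money with the company for the first time in each month.
What I want is a result table with new_users, month, and year as columns.
Before anyone downvotes this post: I have looked through various posts and can't find a similar approach to this problem. The code I include below is based on what I pieced together from related posts.
This is the original table: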
+---------------------+-------------+-----------------+
| datetime | customer_id | amount |
+---------------------+-------------+-----------------+
| 2018-03-01 03:00:00 | 3786 | 14 |
+---------------------+-------------+-----------------+
| 2018-03-02 17:00:00 | 5678 | 25 |
+---------------------+-------------+-----------------+
| 2018-08-17 19:00:00 | 5267 | 45 |
+---------------------+-------------+-----------------+
| 2018-08-25 08:00:00 | 3456 | 78 |
+---------------------+-------------+-----------------+
| 2018-08-25 17:00:00 | 3456 | 25 |
+---------------------+-------------+-----------------+
| 2019-05-25 14:00:00 | 3456 | 15 |
+---------------------+-------------+-----------------+
| 2019-07-02 14:00:00 | 88889 | 45 |
+---------------------+-------------+-----------------+
| 2019-08-25 08:00:00 | 1234 | 88 |
+---------------------+-------------+-----------------+
| 2019-08-30 09:31:00 | 1234 | 30 |
+---------------------+-------------+-----------------+
| 2019-08-30 12:00:00 | 9876 | 55 |
+---------------------+-------------+-----------------+
| 2019-09-01 13:00:00 | 88889 | 23 |
+---------------------+-------------+-----------------+
These are the CREATE and INSERT statements:
CREATE TABLE IF NOT EXISTS `spend` (
  `datetime` datetime NOT NULL,
  `customer_id` int(11) NOT NULL,
  `amount` int(11) NOT NULL,
  PRIMARY KEY (`datetime`)
) DEFAULT CHARSET=utf8mb4;
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2018-03-01 03:00:00', 3786, 14);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2018-03-02 17:00:00', 5678, 25);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2018-08-17 19:00:00', 5267, 45);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2018-08-25 08:00:00', 3456, 78);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2018-08-25 17:00:00', 3456, 25);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2019-05-25 14:00:00', 3456, 15);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2019-07-02 14:00:00', 88889, 45);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2019-08-25 08:00:00', 1234, 88);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2019-08-30 09:31:00', 1234, 30);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2019-08-30 12:00:00', 9876, 55);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2019-09-01 13:00:00', 88889, 23);
This is the code I came up with:
SELECT S.datetime, S.customer_id, S.amount
FROM spend S
INNER JOIN
(SELECT customer_id, MIN(datetime) AS first_occurence
FROM spend
GROUP BY customer_id) X
ON S.customer_id = X.customer_id AND S.datetime = X.first_occurence
This is the result it produces:
+------------------+-------------+-------+
| datetime | customer_id |amount |
+------------------+-------------+-------+
| 01/03/2018 03:00 | 3786 | 14 |
+------------------+-------------+-------+
| 02/03/2018 17:00 | 5678 | 25 |
+------------------+-------------+-------+
| 17/08/2018 19:00 | 5267 | 45 |
+------------------+-------------+-------+
| 25/08/2018 08:00 | 3456 | 78 |
+------------------+-------------+-------+
| 02/07/2019 14:00 | 88889 | 45 |
+------------------+-------------+-------+
| 25/08/2019 08:00 | 1234 | 88 |
+------------------+-------------+-------+
| 30/08/2019 12:00 | 9876 | 55 |
+------------------+-------------+-------+
This is an example of what the desired table should look like:
+-----------+-------+------+
| new_users | month | year |
+-----------+-------+------+
| 2 | 3 | 2018 |
+-----------+-------+------+
| 3 | 8 | 2018 |
+-----------+-------+------+
| 1 | 5 | 2019 |
+-----------+-------+------+
| 1 | 7 | 2019 |
+-----------+-------+------+
| 3 | 8 | 2019 |
+-----------+-------+------+
| 1 | 9 | 2019 |
+-----------+-------+------+
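Answer 0 (score: 1)
You don't need two levels of subqueries. You can find each customer's first purchase with MIN(), then extract YEAR() and MONTH() from that minimum datetime value and count the users per month: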
SELECT
YEAR(min_dt) y,
MONTH(min_dt) m,
COUNT(*) AS new_customers
FROM
(
SELECT customer_id, MIN(datetime) AS min_dt
FROM spend
GROUP BY customer_id
) t
GROUP BY y, m
Result:
| y | m | new_customers |
| ---- | --- | ------------- |
| 2018 | 3 | 2 |
| 2018 | 8 | 2 |
| 2019 | 7 | 1 |
| 2019 | 8 | 2 |
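If you want to check this logic without a MySQL server, here is a minimal sketch that runs the same idea against the question's sample rows using Python's standard-library sqlite3 module. This is a stand-in, not MySQL: SQLite has no YEAR() or MONTH(), so the sketch uses strftime() instead, and the function name new_customers_per_month is my own. I also added an ORDER BY so the row order is deterministic.

```python
import sqlite3

# Sample rows copied from the question's INSERT statements.
ROWS = [
    ("2018-03-01 03:00:00", 3786, 14),
    ("2018-03-02 17:00:00", 5678, 25),
    ("2018-08-17 19:00:00", 5267, 45),
    ("2018-08-25 08:00:00", 3456, 78),
    ("2018-08-25 17:00:00", 3456, 25),
    ("2019-05-25 14:00:00", 3456, 15),
    ("2019-07-02 14:00:00", 88889, 45),
    ("2019-08-25 08:00:00", 1234, 88),
    ("2019-08-30 09:31:00", 1234, 30),
    ("2019-08-30 12:00:00", 9876, 55),
    ("2019-09-01 13:00:00", 88889, 23),
]


def new_customers_per_month(rows):
    """Count customers by the year and month of their first purchase."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE spend ("
        "datetime TEXT PRIMARY KEY, "
        "customer_id INTEGER NOT NULL, "
        "amount INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO spend VALUES (?, ?, ?)", rows)
    # Inner query: earliest purchase per customer.
    # Outer query: group those first purchases by year and month.
    # strftime() replaces MySQL's YEAR()/MONTH() (SQLite assumption).
    query = """
        SELECT CAST(strftime('%Y', min_dt) AS INTEGER) AS y,
               CAST(strftime('%m', min_dt) AS INTEGER) AS m,
               COUNT(*) AS new_customers
        FROM (
            SELECT customer_id, MIN(datetime) AS min_dt
            FROM spend
            GROUP BY customer_id
        ) AS t
        GROUP BY y, m
        ORDER BY y, m
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


print(new_customers_per_month(ROWS))
# [(2018, 3, 2), (2018, 8, 2), (2019, 7, 1), (2019, 8, 2)]
```

The output agrees with the result table above: customer 3456 is counted once, in August 2018, and customer 88889 once, in July 2019, so May 2019 and September 2019 have no new customers.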
Answer 1 (score: 0)
You're off to the right start. Now use your query as a subquery to get the counts per month.
SELECT COUNT(*) AS new_users, MONTH(datetime) AS month, YEAR(datetime) AS year
FROM (
SELECT S.datetime, S.customer_id, S.amount
FROM spend S
INNER JOIN
(SELECT customer_id, MIN(datetime) AS first_occurence
FROM spend
GROUP BY customer_id) X
ON S.customer_id = X.customer_id AND S.datetime = X.first_occurence
) AS x
GROUP BY month, year
ORDER BY year, month
In fact, you don't even need the join in the subquery, because the final result doesn't use the amount of the first purchase.
SELECT COUNT(*) AS new_users, MONTH(datetime) AS month, YEAR(datetime) AS year
FROM (
SELECT customer_id, MIN(datetime) AS datetime
FROM spend
GROUP BY customer_id
) AS x
GROUP BY month, year
ORDER BY year, month
Answer 2 (score: 0)
With the ROW_NUMBER() window function (available in MySQL 8.0 and later):
select
count(*) new_users,
month(t.datetime) month,
year(t.datetime) year
from (
select *,
row_number() over (partition by customer_id order by datetime) rn
from spend
) t
where t.rn = 1
group by year, month
order by year, month
See the demo for the sample data.
Result:
| new_users | month | year |
| --------- | ----- | ---- |
| 2 | 3 | 2018 |
| 2 | 8 | 2018 |
| 1 | 7 | 2019 |
| 2 | 8 | 2019 |
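To see which rows rn = 1 actually keeps, here is a small sketch of the inner query on its own, run with Python's standard-library sqlite3 module as a stand-in for MySQL. It assumes the bundled SQLite is version 3.25 or newer (the first release with window functions); the function name first_purchases is hypothetical.

```python
import sqlite3

# Sample rows copied from the question's INSERT statements.
ROWS = [
    ("2018-03-01 03:00:00", 3786, 14),
    ("2018-03-02 17:00:00", 5678, 25),
    ("2018-08-17 19:00:00", 5267, 45),
    ("2018-08-25 08:00:00", 3456, 78),
    ("2018-08-25 17:00:00", 3456, 25),
    ("2019-05-25 14:00:00", 3456, 15),
    ("2019-07-02 14:00:00", 88889, 45),
    ("2019-08-25 08:00:00", 1234, 88),
    ("2019-08-30 09:31:00", 1234, 30),
    ("2019-08-30 12:00:00", 9876, 55),
    ("2019-09-01 13:00:00", 88889, 23),
]


def first_purchases(rows):
    """Return each customer's earliest row, picked with ROW_NUMBER()."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE spend ("
        "datetime TEXT PRIMARY KEY, "
        "customer_id INTEGER NOT NULL, "
        "amount INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO spend VALUES (?, ?, ?)", rows)
    # Number each customer's purchases from oldest to newest,
    # then keep only the oldest one (rn = 1).
    query = """
        SELECT customer_id, datetime, amount
        FROM (
            SELECT *,
                   ROW_NUMBER() OVER (
                       PARTITION BY customer_id ORDER BY datetime
                   ) AS rn
            FROM spend
        ) AS t
        WHERE rn = 1
        ORDER BY datetime
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


for row in first_purchases(ROWS):
    print(row)
```

This returns seven rows, one per customer, and they are the same first purchases as in the question's own result table. Unlike the MIN() approach, the whole row survives, so the amount of the first purchase stays available if you need it later.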
Answer 3 (score: 0)
You can also do it this way:
select
count(*) new_users,
month(datetime) month,
year(datetime) year
from spend
where datetime in (select min(datetime) from spend group by customer_id)
group by year, month
order by year, month;
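One caveat: the IN form matches on the timestamp alone, not on the customer. In the question's table datetime is the primary key, so every timestamp is unique and this query gives the same counts as the others. The sketch below assumes a hypothetical table without that constraint, with made-up rows in which one customer's repeat purchase shares a timestamp with another customer's first purchase. It uses Python's standard-library sqlite3 module as a stand-in for MySQL, with strftime() in place of YEAR()/MONTH(); both function names are my own.

```python
import sqlite3

# Hypothetical rows (not from the question): customer 1's second purchase
# has the same timestamp as customer 2's first purchase.
ROWS = [
    ("2020-01-05 10:00:00", 1, 10),
    ("2020-02-10 09:00:00", 1, 20),
    ("2020-02-10 09:00:00", 2, 30),
]


def _load(rows):
    """Create an in-memory table WITHOUT a unique key on datetime."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE spend ("
        "datetime TEXT NOT NULL, "
        "customer_id INTEGER NOT NULL, "
        "amount INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO spend VALUES (?, ?, ?)", rows)
    return conn


def count_with_in(rows):
    """Per-month counts using the IN (SELECT MIN(...)) form."""
    conn = _load(rows)
    result = conn.execute("""
        SELECT CAST(strftime('%Y', datetime) AS INTEGER) AS y,
               CAST(strftime('%m', datetime) AS INTEGER) AS m,
               COUNT(*)
        FROM spend
        WHERE datetime IN (
            SELECT MIN(datetime) FROM spend GROUP BY customer_id
        )
        GROUP BY y, m
        ORDER BY y, m
    """).fetchall()
    conn.close()
    return result


def count_with_min(rows):
    """Per-month counts using one MIN(datetime) per customer."""
    conn = _load(rows)
    result = conn.execute("""
        SELECT CAST(strftime('%Y', min_dt) AS INTEGER) AS y,
               CAST(strftime('%m', min_dt) AS INTEGER) AS m,
               COUNT(*)
        FROM (
            SELECT customer_id, MIN(datetime) AS min_dt
            FROM spend
            GROUP BY customer_id
        ) AS t
        GROUP BY y, m
        ORDER BY y, m
    """).fetchall()
    conn.close()
    return result


print(count_with_in(ROWS))   # [(2020, 1, 1), (2020, 2, 2)]
print(count_with_min(ROWS))  # [(2020, 1, 1), (2020, 2, 1)]
```

With these rows the IN form reports two new users for February 2020 because customer 1's repeat purchase matches customer 2's first timestamp, while grouping by customer reports one. If timestamps are not guaranteed to be unique, the per-customer MIN() or ROW_NUMBER() forms are the safer choice.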