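Identifying customers who spend with the company for the first time each month (MySQL)

Time: 2019-10-01 17:13:10

Tags: mysql sql mysql-8.0

I have the table below, and I am trying to determine the number of users who spent money with the company for the first time in each month.

What I want is a result table with new users, month, and year as columns.

Before people downvote this post: I have looked through various posts and can't seem to find a similar solution to this problem. The code I include below is based on what I pieced together from related posts.

This is the original table: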

+---------------------+-------------+-----------------+
| datetime            | customer_id | amount          |
+---------------------+-------------+-----------------+
| 2018-03-01 03:00:00 | 3786        | 14              |
+---------------------+-------------+-----------------+
| 2018-03-02 17:00:00 | 5678        | 25              |
+---------------------+-------------+-----------------+
| 2018-08-17 19:00:00 | 5267        | 45              |
+---------------------+-------------+-----------------+
| 2018-08-25 08:00:00 | 3456        | 78              |
+---------------------+-------------+-----------------+
| 2018-08-25 17:00:00 | 3456        | 25              |
+---------------------+-------------+-----------------+
| 2019-05-25 14:00:00 | 3456        | 15              |
+---------------------+-------------+-----------------+
| 2019-07-02 14:00:00 | 88889       | 45              |
+---------------------+-------------+-----------------+
| 2019-08-25 08:00:00 | 1234        | 88              |
+---------------------+-------------+-----------------+
| 2019-08-30 09:31:00 | 1234        | 30              |
+---------------------+-------------+-----------------+
| 2019-08-30 12:00:00 | 9876        | 55              |
+---------------------+-------------+-----------------+
| 2019-09-01 13:00:00 | 88889       | 23              |
+---------------------+-------------+-----------------+

Here are the CREATE and INSERT statements:

CREATE TABLE IF NOT EXISTS `spend` (
  `datetime` datetime NOT NULL,
  `customer_id` int(11) NOT NULL,
  `amount` int(11) NOT NULL,
  PRIMARY KEY (`datetime`)
) DEFAULT CHARSET=utf8mb4;
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2018-03-01 03:00:00', 3786, 14);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2018-03-02 17:00:00', 5678, 25);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2018-08-17 19:00:00', 5267, 45);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2018-08-25 08:00:00', 3456, 78);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2018-08-25 17:00:00', 3456, 25);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2019-05-25 14:00:00', 3456, 15);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2019-07-02 14:00:00', 88889, 45);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2019-08-25 08:00:00', 1234, 88);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2019-08-30 09:31:00', 1234, 30);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2019-08-30 12:00:00', 9876, 55);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2019-09-01 13:00:00', 88889, 23);

This is the code I came up with:

SELECT S.datetime, S.customer_id, S.amount 
FROM spend S
INNER JOIN
    (SELECT customer_id, MIN(datetime) AS first_occurence
    FROM spend
    GROUP BY customer_id) X
ON S.customer_id = X.customer_id AND S.datetime = X.first_occurence

This is the resulting table:

+------------------+-------------+-------+
| datetime         | customer_id |amount |
+------------------+-------------+-------+
| 01/03/2018 03:00 | 3786        | 14    |
+------------------+-------------+-------+
| 02/03/2018 17:00 | 5678        | 25    |
+------------------+-------------+-------+
| 17/08/2018 19:00 | 5267        | 45    |
+------------------+-------------+-------+
| 25/08/2018 08:00 | 3456        | 78    |
+------------------+-------------+-------+
| 02/07/2019 14:00 | 88889       | 45    |
+------------------+-------------+-------+
| 25/08/2019 08:00 | 1234        | 88    |
+------------------+-------------+-------+
| 30/08/2019 12:00 | 9876        | 55    |
+------------------+-------------+-------+

Here is an example of what the table should look like:

+-----------+-------+------+
| new_users | month | year |
+-----------+-------+------+
| 2         | 3     | 2018 |
+-----------+-------+------+
| 3         | 8     | 2018 |
+-----------+-------+------+
| 1         | 5     | 2019 |
+-----------+-------+------+
| 1         | 7     | 2019 |
+-----------+-------+------+
| 3         | 8     | 2019 |
+-----------+-------+------+
| 1         | 9     | 2019 |
+-----------+-------+------+

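4 answers:

Answer 0 (score: 1)

You don't need a two-level-deep subquery. You can simply find each customer's first spend with MIN(), and then extract YEAR() and MONTH() from that minimum datetime value to count the users: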

SELECT 
  YEAR(min_dt) y,
  MONTH(min_dt) m,
  COUNT(*) AS new_customers 
FROM 
(
  SELECT customer_id, MIN(datetime) AS min_dt 
  FROM spend 
  GROUP BY customer_id 
) t
GROUP BY y, m
ORDER BY y, m  -- GROUP BY alone does not guarantee row order in MySQL 8.0

Result

| y    | m   | new_customers |
| ---- | --- | ------------- |
| 2018 | 3   | 2             |
| 2018 | 8   | 2             |
| 2019 | 7   | 1             |
| 2019 | 8   | 2             |

View on DB Fiddle
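
As a rough sketch of the same idea, the per-customer MIN() step can be named with a common table expression, and the year and month can be folded into one key with DATE_FORMAT(). The names `first_spend` and `first_month` are made up for illustration and are not part of the original answer; this assumes MySQL 8.0, which the question is tagged with.

```sql
-- Sketch only: same per-customer MIN() idea, written as a CTE (MySQL 8.0+).
-- `first_spend` and `first_month` are illustrative names, not from the answer.
WITH first_spend AS (
  SELECT customer_id, MIN(`datetime`) AS min_dt  -- first spend per customer
  FROM spend
  GROUP BY customer_id
)
SELECT
  DATE_FORMAT(min_dt, '%Y-%m') AS first_month,   -- e.g. '2018-03'
  COUNT(*) AS new_customers
FROM first_spend
GROUP BY first_month
ORDER BY first_month;                            -- 'YYYY-MM' strings sort chronologically
```

On the sample data from the question this should give:

| first_month | new_customers |
| ----------- | ------------- |
| 2018-03     | 2             |
| 2018-08     | 2             |
| 2019-07     | 1             |
| 2019-08     | 2             |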

Answer 1 (score: 0)

You started out correctly. Now use it as a subquery to get the counts by month.

SELECT COUNT(*) AS new_users, MONTH(datetime) AS month, YEAR(datetime) AS year
FROM (
    SELECT S.datetime, S.customer_id, S.amount 
    FROM spend S
    INNER JOIN
        (SELECT customer_id, MIN(datetime) AS first_occurence
        FROM spend
        GROUP BY customer_id) X
    ON S.customer_id = X.customer_id AND S.datetime = X.first_occurence
) AS x
GROUP BY month, year
ORDER BY year, month

In fact, you don't even need the join in the subquery, because you aren't using the amount of the first purchase in the final result.

SELECT COUNT(*) AS new_users, MONTH(datetime) AS month, YEAR(datetime) AS year
FROM (
    SELECT customer_id, MIN(datetime) AS datetime
    FROM spend
    GROUP BY customer_id
) AS x
GROUP BY month, year
ORDER BY year, month
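
For contrast, here is a sketch of a case where the join would still be needed: if the report also had to show how much new customers spent on their first purchase, the amount has to come from the matching `spend` row. The `first_purchase_total` column is a hypothetical addition, not something the question asks for.

```sql
-- Sketch only: the join is kept because the first purchase's amount is used.
-- `first_purchase_total` is a hypothetical extra column for illustration.
SELECT
  COUNT(*) AS new_users,
  SUM(S.amount) AS first_purchase_total,
  MONTH(S.datetime) AS month,
  YEAR(S.datetime) AS year
FROM spend S
INNER JOIN (
  SELECT customer_id, MIN(datetime) AS first_occurence
  FROM spend
  GROUP BY customer_id
) X ON S.customer_id = X.customer_id AND S.datetime = X.first_occurence
GROUP BY year, month
ORDER BY year, month;
```

On the sample data this should give:

| new_users | first_purchase_total | month | year |
| --------- | -------------------- | ----- | ---- |
| 2         | 39                   | 3     | 2018 |
| 2         | 123                  | 8     | 2018 |
| 1         | 45                   | 7     | 2019 |
| 2         | 143                  | 8     | 2019 |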

Answer 2 (score: 0)

With the ROW_NUMBER() window function:

select 
  count(*) new_users,
  month(t.datetime) month,
  year(t.datetime) year
from (
  select *,
    row_number() over (partition by customer_id order by datetime) rn
  from spend
) t
where t.rn = 1
group by year, month
order by year, month

See the demo for the sample data.
Results:

| new_users | month | year |
| --------- | ----- | ---- |
| 2         | 3     | 2018 |
| 2         | 8     | 2018 |
| 1         | 7     | 2019 |
| 2         | 8     | 2019 |
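
To see why filtering on `rn = 1` keeps exactly one row per customer, it can help to look at the numbering for a single customer. The sketch below is only an inspection query, not part of the answer; customer 3456 is picked because it has three rows in the sample data.

```sql
-- Sketch only: show the row numbers ROW_NUMBER() assigns to one customer.
select
  customer_id,
  datetime,
  row_number() over (partition by customer_id order by datetime) rn
from spend
where customer_id = 3456   -- customer with three purchases in the sample data
order by datetime;
```

On the sample data this should give:

| customer_id | datetime            | rn  |
| ----------- | ------------------- | --- |
| 3456        | 2018-08-25 08:00:00 | 1   |
| 3456        | 2018-08-25 17:00:00 | 2   |
| 3456        | 2019-05-25 14:00:00 | 3   |

Only the `rn = 1` row, the earliest purchase, is counted by the outer query.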

Answer 3 (score: 0)

You can also do it like this:

select   
count(*) new_users,
month(datetime) month,
year(datetime) year
from spend
where datetime in (select min(datetime) from spend group by customer_id)
group by year, month
order by year, month;
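
The query above matches on `datetime` alone, which is safe here because `datetime` is the primary key of `spend` and therefore unique. As a hedged sketch for a table where that were not the case (an assumption, not the schema in the question), the subquery can return the customer together with its first timestamp, and the outer query can compare both columns and count distinct customers:

```sql
-- Sketch only: compare (customer_id, datetime) pairs instead of datetime alone.
-- Assumes a variant of `spend` where datetime is not guaranteed to be unique.
select
  count(distinct customer_id) new_users,  -- one count per customer even with duplicate rows
  month(datetime) month,
  year(datetime) year
from spend
where (customer_id, datetime) in (
  select customer_id, min(datetime)
  from spend
  group by customer_id
)
group by year, month
order by year, month;
```

On the sample data from the question this should return the same four rows as the other answers (2, 2, 1, and 2 new users for 3/2018, 8/2018, 7/2019, and 8/2019).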