MySQL: calculating daily new users vs. returning users (cohort analysis)

Time: 2015-05-27 16:59:55

Tags: mysql sql statistics

The table structure is: user_id, Date (I work with timestamps)

For example:

user id | Date (TS)
A       | '2014-08-10 14:02:53' 
A       | '2014-08-12 14:03:25' 
A       | '2014-08-13 14:04:47'
B       | '2014-08-13 04:04:47'
...

And the following week I have:

user id | Date (TS)
A       | '2014-08-17 09:02:53'     
B       | '2014-08-17 10:04:47'
B       | '2014-08-18 10:04:47'
A       | '2014-08-19 10:04:22'
C       | '2014-08-19 11:04:47'
...

Today I have:

user id | Date (TS)
A       | '2015-05-27 09:02:53'     
B       | '2015-05-27 10:04:47'
C       | '2015-05-27 10:04:22'
D       | '2015-05-27 17:04:47'

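I need to know how to write a single query that finds, for each day since the beginning, the number of "returning users", i.e. users who had already been active before that day.

Expected result: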

date        | New user | returned User
2014-08-10  |  1       | 0
2014-08-11  |  0       | 0
2014-08-12  |  0       | 1 (A was active on 08/10)
2014-08-13  |  1       | 1 (A was active on 08/12 & 08/10)
...
2014-08-17  |  0       | 2 (A & B were already active )
2014-08-18  |  0       | 1 
2014-08-19  |  1       | 1 
...
2015-05-27  |  1       | 3 (D is a new user) 

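After a long search on Stack Overflow, I found some material by spencer7593 (https://meta.stackoverflow.com/users/107744/spencer7593): "Weekly Active Users for each day from log", but I did not manage to adapt that query to output my expected result.

Thanks for your help.

2 answers:

Answer 0: (score: 3)

Assuming you have a date table somewhere (and using T-SQL syntax, because I know it better...), the key is to calculate the MinDate (the date of first activity) for each user separately, calculate the total number of users on each day, and then simply declare that a returning user is a user who is not new: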

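-- "Table" is a placeholder for the actual log table name.
SELECT DateTable.Date,
       COALESCE(NewUsers, 0) AS NewUsers,  -- NULL when nobody is new that day
       COALESCE(NumUsers, 0) - COALESCE(NewUsers, 0) AS ReturningUsers
FROM
DateTable
    LEFT JOIN
        (
        -- Number of users whose first activity falls on each date
        SELECT MinDate, COUNT(user_id) AS NewUsers
        FROM (
                -- First activity date per user
                SELECT user_id, min(CAST(date AS Date)) as MinDate
                FROM Table
                GROUP BY user_id
            ) A
        GROUP BY MinDate
        ) B ON DateTable.Date = B.MinDate
    LEFT JOIN
        (
        -- Distinct users active on each date
        SELECT CAST(date AS Date) AS Date, COUNT(DISTINCT user_id) AS NumUsers
        FROM Table
        GROUP BY CAST(date AS Date)
        ) C ON DateTable.Date = C.Date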
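
To check the idea against the first week of sample data from the question, here is a minimal sketch that runs the same three steps (first date per user, distinct users per day, subtraction) in an in-memory SQLite database through Python's `sqlite3` module. SQLite is used only so the example is runnable on its own; it is not MySQL or T-SQL. The table name `stats`, the column name `created`, the recursive `calendar` CTE standing in for the date table, and its hard-coded date range are all assumptions made for illustration.

```python
import sqlite3

# Sample rows copied from the question (first week only).
ROWS = [
    ("A", "2014-08-10 14:02:53"),
    ("A", "2014-08-12 14:03:25"),
    ("A", "2014-08-13 14:04:47"),
    ("B", "2014-08-13 04:04:47"),
]

QUERY = """
WITH RECURSIVE calendar(day) AS (      -- assumed stand-in for the date table
    SELECT '2014-08-10'
    UNION ALL
    SELECT DATE(day, '+1 day') FROM calendar WHERE day < '2014-08-13'
),
first_seen AS (                        -- first activity date per user
    SELECT user_id, MIN(DATE(created)) AS min_date
    FROM stats
    GROUP BY user_id
),
new_users AS (                         -- users whose first date is that day
    SELECT min_date, COUNT(*) AS new_users
    FROM first_seen
    GROUP BY min_date
),
active_users AS (                      -- distinct users active per day
    SELECT DATE(created) AS day, COUNT(DISTINCT user_id) AS num_users
    FROM stats
    GROUP BY DATE(created)
)
SELECT c.day,
       COALESCE(n.new_users, 0) AS new_users,
       COALESCE(a.num_users, 0) - COALESCE(n.new_users, 0) AS returning_users
FROM calendar c
LEFT JOIN new_users n ON n.min_date = c.day
LEFT JOIN active_users a ON a.day = c.day
ORDER BY c.day
"""


def daily_new_vs_returning(rows):
    """Load rows into an in-memory table and return (day, new, returning) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stats (user_id TEXT, created TEXT)")
    conn.executemany("INSERT INTO stats VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in daily_new_vs_returning(ROWS):
        print(row)
```

Because the outer query starts from the calendar and uses `COALESCE`, a day with no activity at all (2014-08-11 here) still appears with zeros, as in the expected result.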

Answer 1: (score: 2)

Thanks to Stephen. I made a small fix to his query, and it works well, although it takes some time on a large database:

SELECT 
    DATE(Stats.Created),
    COALESCE(NewUsers, 0) AS NewUsers,  -- NULL when nobody is new that day
    NumUsers - COALESCE(NewUsers, 0) AS ReturningUsers
FROM
    Stats
LEFT JOIN
    (
        SELECT
            MinDate,
            COUNT(user_id) AS NewUsers
        FROM (
            SELECT
                user_id,
                MIN(DATE(Created)) as MinDate
            FROM Stats
            GROUP BY user_id
        ) A
        GROUP BY MinDate
    ) B
ON DATE(Stats.Created) = B.MinDate
LEFT JOIN
    (
        SELECT 
            DATE(Created) AS Date,
            COUNT(DISTINCT user_id) AS NumUsers
        FROM Stats
        GROUP BY DATE(Created)
    ) C
ON DATE(Stats.Created) = C.Date
GROUP BY DATE(Stats.Created)
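
Since this query selects from Stats itself rather than from a date table, days without any activity (such as 2014-08-11) do not appear in its output. It also joins every Stats row to both subqueries before the final GROUP BY collapses them. One possible alternative, sketched below, joins each row only to its user's first activity date and counts new and returning users with conditional aggregation. This is not the answerer's query. It is run here against SQLite through Python's `sqlite3` module purely so that it is self-contained, and the lowercase names `stats` and `created` are assumptions. Like the query above, it lists only days that have activity.

```python
import sqlite3

# Sample rows copied from the question (first two weeks).
SAMPLE = [
    ("A", "2014-08-10 14:02:53"),
    ("A", "2014-08-12 14:03:25"),
    ("A", "2014-08-13 14:04:47"),
    ("B", "2014-08-13 04:04:47"),
    ("A", "2014-08-17 09:02:53"),
    ("B", "2014-08-17 10:04:47"),
    ("B", "2014-08-18 10:04:47"),
    ("A", "2014-08-19 10:04:22"),
    ("C", "2014-08-19 11:04:47"),
]

COHORT_QUERY = """
SELECT DATE(s.created) AS day,
       -- a user is new on the day that equals their first activity date
       COUNT(DISTINCT CASE WHEN f.min_date = DATE(s.created)
                           THEN s.user_id END) AS new_users,
       -- a user is returning on any later day
       COUNT(DISTINCT CASE WHEN f.min_date < DATE(s.created)
                           THEN s.user_id END) AS returning_users
FROM stats s
JOIN (
    SELECT user_id, MIN(DATE(created)) AS min_date
    FROM stats
    GROUP BY user_id
) f ON f.user_id = s.user_id
GROUP BY DATE(s.created)
ORDER BY day
"""


def new_vs_returning_by_active_day(rows):
    """Return (day, new, returning) tuples for every day that has activity."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stats (user_id TEXT, created TEXT)")
    conn.executemany("INSERT INTO stats VALUES (?, ?)", rows)
    result = conn.execute(COHORT_QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in new_vs_returning_by_active_day(SAMPLE):
        print(row)
```

On the sample data the counts agree with the expected result in the question for every day that has activity. Filling in the missing days would still need a date table or a generated calendar.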