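Situation
We have a PostgreSQL 8.4 database containing user sessions, with a login date/time and a logout date/time in each row. Our web application records these times and also handles the case where a user does not log out explicitly (session timeout). So a login date/time and a logout date/time are present in every case.
Goal
I need per-day statistics on the maximum number of concurrent user sessions, so that I can say something like: "On 2015-03-16, the number of concurrently logged-in users peaked at 6."
Similar question
A similar question has been answered here: SQL max concurrent sessions per hour of day. However, I could not adapt that solution to my case: I want a result table showing the maximum number of concurrent user sessions per day, not per hour. The table schema is also slightly different, because in my case one row contains both the login and the logout date/time, whereas in that example each row represents either a login or a logout. In addition, that question is based on an MS SQL database environment rather than PostgreSQL.
Considerations
Table schema: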
user_id | login_date | login_time | logout_date | logout_time
------------+--------------+--------------+---------------+-------------
USER32 | 2014-03-03 | 08:23:00 | 2014-03-03 | 14:44:00
USER82 | 2014-03-03 | 08:49:00 | 2014-03-03 | 17:18:00
USER83 | 2014-03-03 | 09:40:00 | 2014-03-03 | 17:31:00
USER36 | 2014-03-03 | 09:50:00 | 2014-03-03 | 16:10:00
USER37 | 2014-03-03 | 11:44:00 | 2014-03-03 | 15:21:00
USER72 | 2014-03-03 | 12:52:00 | 2014-03-03 | 12:55:00
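Example
The following example, displayed as a timeline via the Google Charts API, should help to understand the problem: http://i.imgur.com/ZOjnLll.png
Given the example for the day 2015-03-03, all users except USER78 (6 users) are logged in between 12:52 and 12:55 on that day. This is the maximum number of concurrently logged-in users, and I need this statistic for every day in a given time range.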
Day | MaxNumberOfConcurrentSessions
------------+--------------------------------
2015-03-01 | 2
2015-03-02 | 3
2015-03-03 | 6
...
The timeline from the screenshot above, as a Google Charts API example:
google.setOnLoadCallback(drawChart);
function drawChart() {
var container = document.getElementById('example5.1');
var chart = new google.visualization.Timeline(container);
var dataTable = new google.visualization.DataTable();
dataTable.addColumn({ type: 'string', id: 'Room' });
dataTable.addColumn({ type: 'string', id: 'Name' });
dataTable.addColumn({ type: 'date', id: 'Start' });
dataTable.addColumn({ type: 'date', id: 'End' });
dataTable.addRows([
// JavaScript months are zero-based, so 2 = March. Leading zeros are dropped
// because literals such as 08 and 09 are rejected in strict mode.
["USER78", '', new Date(2014, 2, 3, 20, 38), new Date(2014, 2, 3, 21, 14)],
["USER83", '', new Date(2014, 2, 3, 9, 40), new Date(2014, 2, 3, 17, 31)],
["USER72", '', new Date(2014, 2, 3, 8, 43), new Date(2014, 2, 3, 8, 43)],
["USER72", '', new Date(2014, 2, 3, 9, 40), new Date(2014, 2, 3, 9, 40)],
["USER72", '', new Date(2014, 2, 3, 10, 3), new Date(2014, 2, 3, 10, 6)],
["USER72", '', new Date(2014, 2, 3, 12, 52), new Date(2014, 2, 3, 12, 55)],
["USER72", '', new Date(2014, 2, 3, 21, 13), new Date(2014, 2, 3, 21, 13)],
["USER72", '', new Date(2014, 2, 3, 21, 37), new Date(2014, 2, 3, 21, 38)],
["USER72", '', new Date(2014, 2, 3, 23, 14), new Date(2014, 2, 3, 23, 15)],
["USER72", '', new Date(2014, 2, 3, 23, 27), new Date(2014, 2, 3, 23, 28)],
["USER36", '', new Date(2014, 2, 3, 8, 5), new Date(2014, 2, 3, 9, 17)],
["USER36", '', new Date(2014, 2, 3, 9, 50), new Date(2014, 2, 3, 16, 10)],
["USER36", '', new Date(2014, 2, 3, 16, 12), new Date(2014, 2, 3, 20, 29)],
["USER32", '', new Date(2014, 2, 3, 8, 23), new Date(2014, 2, 3, 14, 44)],
["USER82", '', new Date(2014, 2, 3, 8, 49), new Date(2014, 2, 3, 17, 18)],
["USER37", '', new Date(2014, 2, 3, 8, 4), new Date(2014, 2, 3, 8, 6)],
["USER37", '', new Date(2014, 2, 3, 11, 44), new Date(2014, 2, 3, 15, 21)],
["USER37", '', new Date(2014, 2, 3, 15, 34), new Date(2014, 2, 3, 15, 51)],
["USER37", '', new Date(2014, 2, 3, 16, 12), new Date(2014, 2, 3, 16, 14)],
["USER37", '', new Date(2014, 2, 3, 16, 52), new Date(2014, 2, 3, 16, 54)],
["USER37", '', new Date(2014, 2, 3, 17, 7), new Date(2014, 2, 3, 17, 8)],
["USER37", '', new Date(2014, 2, 3, 20, 20), new Date(2014, 2, 3, 20, 24)],
["USER37", '', new Date(2014, 2, 3, 21, 3), new Date(2014, 2, 3, 21, 20)],
["USER37", '', new Date(2014, 2, 3, 22, 42), new Date(2014, 2, 3, 23, 5)],
["USER37", '', new Date(2014, 2, 3, 23, 51), new Date(2014, 2, 3, 23, 56)],
["USER01", '', new Date(2014, 2, 3, 16, 11), new Date(2014, 2, 3, 16, 12)]
]);
var options = {
timeline: { colorByRowLabel: true }
};
chart.draw(dataTable, options);
}

<script type="text/javascript" src="https://www.google.com/jsapi?autoload={'modules':[{'name':'visualization',
'version':'1','packages':['timeline']}]}"></script>
<div id="example5.1" style="width:5000px;height: 600px;"></div>
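Answer 0 (score: 4)
I would serialize logins and logouts with UNION ALL, counting "in" as 1 and "out" as -1. Then compute a running count with a simple window function and take the maximum per day.
Since this has not been specified, the query below makes a few assumptions (for example, sessions may span several days, and a logout at the same instant as a login is counted first):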
WITH range AS (
   SELECT '2014-03-01'::date AS start_date  -- time range
        , '2014-03-31'::date AS end_date    -- inclusive bounds
   )
, cte AS (
   SELECT *
   FROM   tbl, range r
   WHERE  login_date  <= r.end_date
   AND    logout_date >= r.start_date
   )
, ct AS (
   SELECT log_date, sum(ct) OVER (ORDER BY log_date, log_time, ct) AS session_ct
   FROM  (
      SELECT logout_date AS log_date, logout_time AS log_time, -1 AS ct FROM cte
      UNION ALL
      SELECT login_date, login_time, 1 FROM cte
      ) sub
   )
SELECT log_date, max(session_ct) AS max_sessions
FROM   ct, range r
WHERE  log_date BETWEEN r.start_date AND r.end_date  -- crop actual time range
GROUP  BY 1
ORDER  BY 1;
You might use the OVERLAPS operator in the CTE cte instead:
AND (login_date, logout_date) OVERLAPS (r.start_date, r.end_date)
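But that might not be a good idea, because (per documentation):
Each time period is considered to represent the half-open interval start <= time < end, unless start and end are equal in which case it represents that single time instant. This means for instance that two time periods with only an endpoint in common do not overlap.
The upper bound of your range would have to be the day after your desired time frame.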
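To make the half-open rule concrete, here is a small Python model of the quoted sentence. It is only a sketch of the documented rule, not PostgreSQL's implementation; the helper names `contains` and `overlaps` are made up for illustration, and periods are assumed to be given with start <= end.

```python
from datetime import date

def contains(start, end, point):
    """True if point lies in the period: [start, end), or the single instant when start == end."""
    if start == end:
        return point == start
    return start <= point < end

def overlaps(s1, e1, s2, e2):
    """Model of the quoted rule for two periods, each given with start <= end."""
    if s1 == e1:                      # first period is a single instant
        return contains(s2, e2, s1)
    if s2 == e2:                      # second period is a single instant
        return contains(s1, e1, s2)
    return s1 < e2 and s2 < e1        # two half-open intervals share a point

# A session that starts and ends on the last wanted day (2014-03-31).
last_day = date(2014, 3, 31)
missed = overlaps(last_day, last_day, date(2014, 3, 1), date(2014, 3, 31))
found = overlaps(last_day, last_day, date(2014, 3, 1), date(2014, 4, 1))
print(missed, found)  # False True
```

With the range ending on 2014-03-31 the session on that day is excluded; moving the upper bound to 2014-04-01 includes it.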
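The 1st CTE range is just for convenience, to provide the time range once.
The 2nd CTE cte selects only relevant rows: those that start on or before the end of the range and end on or after its start.
The 3rd CTE ct serializes "in" and "out" points with values of +/- 1 and computes a running count with the aggregate function sum() used as a window function. Window functions are available since Postgres 8.4.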
The final SELECT trims leading and trailing days and aggregates the maximum per day. Voilà.
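The steps above can also be sketched outside the database, which may help when cross-checking results. The Python below is a rough analogue of the query, not a translation of it: the function name `max_concurrent_per_day` is hypothetical, sessions are assumed to be `(login, logout)` pairs of `datetime` values, and, like the query, it only records the count at login and logout events.

```python
from datetime import date, datetime

def max_concurrent_per_day(sessions, start_date, end_date):
    """Return {day: peak running session count} for days inside the range."""
    events = []
    for login, logout in sessions:
        # Keep only sessions touching the range (the role of the 2nd CTE).
        if login.date() <= end_date and logout.date() >= start_date:
            events.append((login, 1))
            events.append((logout, -1))
    # Tuples sort by time first; at equal times -1 sorts before +1,
    # so a logout is counted before a simultaneous login.
    events.sort()
    peaks = {}
    running = 0
    for moment, delta in events:
        running += delta
        day = moment.date()
        if start_date <= day <= end_date:   # crop to the requested range
            peaks[day] = max(peaks.get(day, running), running)
    return peaks

def at(hour, minute):
    return datetime(2014, 3, 3, hour, minute)

# The six sample rows from the table schema in the question.
sample = [
    (at(8, 23), at(14, 44)), (at(8, 49), at(17, 18)), (at(9, 40), at(17, 31)),
    (at(9, 50), at(16, 10)), (at(11, 44), at(15, 21)), (at(12, 52), at(12, 55)),
]
result = max_concurrent_per_day(sample, date(2014, 3, 1), date(2014, 3, 31))
print(result)  # {datetime.date(2014, 3, 3): 6}
```

Sorting the `(time, delta)` tuples plays the role of `ORDER BY log_date, log_time, ct` in the window definition.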
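SQL Fiddle. Postgres 8.4 is too old and no longer available there, but it should work the same. I added a row to the test case, one spanning multiple days, which should make it more useful.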
I would generally use timestamp instead of date and time. Same size, easier to handle.
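As a loose illustration of why a single combined value is easier to handle, the Python sketch below uses `datetime` as a stand-in for a timestamp column (an analogy, not Postgres behavior). The session values are invented for the example.

```python
from datetime import date, time, datetime

# A hypothetical session that crosses midnight.
login = datetime.combine(date(2014, 3, 3), time(23, 50))
logout = datetime.combine(date(2014, 3, 4), time(0, 10))

# With one combined value, ordering and duration need a single comparison.
duration = logout - login
print(duration)  # 0:20:00

# Comparing only the time parts would wrongly suggest logout precedes login.
time_only_misleads = logout.time() < login.time()
```

With separate date and time fields, every comparison has to consider both fields, as the `ORDER BY log_date, log_time` in the query does.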
An index on (login_date, logout_date DESC) is the bare minimum for good performance.
Answer 1 (score: 0)
My idea so far:
The SQL statement looks like this:
SELECT report_date, MAX(concurrent_sessions) AS max_concurrent_sessions
FROM (
    SELECT report_date, session_id, count(session_id) AS concurrent_sessions
    FROM (
        SELECT s1.id AS session_id, s1.user_id, s1.login_date AS report_date, s1.login_time, s1.logout_date, s1.logout_time,
               s2.id, s2.user_id, s2.login_date, s2.login_time, s2.logout_date, s2.logout_time
        FROM sessions s1
        INNER JOIN sessions s2 ON s1.login_date = s2.login_date
        WHERE s1.login_date BETWEEN '2014-03-01' AND '2014-03-31'
          AND (s1.login_time, s1.logout_time) OVERLAPS (s2.login_time, s2.logout_time)
          AND s1.login_time >= s2.login_time
          AND s1.logout_time <= s2.logout_time
        ORDER BY s1.id
    ) AS concurrent_overlapping_sessions
    GROUP BY report_date, session_id
) AS max_concurrent_overlapping_sessions
GROUP BY report_date
ORDER BY report_date
What do you think of this solution compared to the other proposed solutions (e.g. performance, correctness, etc.)?