Every time a logged-in user visits the website, a row is written to a table containing the userId and the date (one row or zero rows per user per day):
444631 2011-11-07
444631 2011-11-06
444631 2011-11-05
444631 2011-11-04
444631 2011-11-02
444631 2011-11-01
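When I pull a user's data from the main user table, I need the number of consecutive visits ready to hand. For this user it would be 4.
Currently I do this with a denormalized consecutivevisits counter in the main user table,
but for unknown reasons it sometimes resets. I would like to try an approach that uses only the data in the table above.
What is the best SQL query to get that number (4 in the example above)? Some users have hundreds of visits, and we have millions of registered users and hits per day.
Edit: as requested in the comments below, I am posting the code I currently use to do this. It has a problem, though: it sometimes resets for no apparent reason, and it also reset everyone over the weekend, most likely because of the DST change.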
// Called every page load for logged in users
public static function OnVisit($user)
{
    $lastVisit = $user->GetLastVisit(); /* Timestamp; db server is on the same timezone as www server */
    if(!$lastVisit)
        $delta = 2;
    else
    {
        $today = date('Y/m/d');
        if(date('Y/m/d', $lastVisit) == $today)
            $delta = 0;
        else if(date('Y/m/d', $lastVisit + (24 * 60 * 60)) == $today)
            $delta = 1;
        else
            $delta = 2;
    }
    if(!$delta)
        return;
    $visits = $user->GetConsecutiveVisits();
    $userId = $user->GetId();
    /* NOTE: t_dailyvisit is the table I pasted above. The table is unused;
     * I added it only to ensure that the counter sometimes really resets
     * even if the user visits the website, and I could confirm that. */
    q_Query("INSERT IGNORE INTO `t_dailyvisit` (`user`, `date`) VALUES ($userId, CURDATE())", DB_DATABASE_COMMON);
    /* User skipped 1 or more days.. */
    if($delta > 1)
        $visits = 1;
    else if($delta == 1)
        $visits += 1;
    q_Query("UPDATE `t_user` SET `consecutivevisits` = $visits, `lastvisit` = CURDATE(), `nvotesday` = 0 WHERE `id` = $userId", DB_DATABASE_COMMON);
    $user->ForceCacheExpire();
}
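A possible explanation for the weekend reset, offered as a guess rather than a diagnosis: lastvisit is stored as CURDATE(), so the timestamp read back is presumably local midnight, and the code then adds 24 * 60 * 60 seconds to it. On the day the clocks fall back, the local day lasts 25 hours, so midnight plus 86400 seconds is still 23:00 on the same calendar day and the "visited yesterday" test fails for everyone. The sketch below is in Python rather than PHP and only reproduces the arithmetic. The US Eastern offsets and the 2011-11-06 date are assumptions chosen for illustration, and naive_next_day and days_between are hypothetical helper names.

```python
from datetime import date, datetime, timedelta, timezone

# Assumed example: US Eastern time on 2011-11-06, the day clocks fell back
# from EDT (UTC-4) to EST (UTC-5), so that local day lasted 25 hours.
EDT = timezone(timedelta(hours=-4))
EST = timezone(timedelta(hours=-5))


def naive_next_day(last_visit_midnight: datetime, offset_after: timezone) -> date:
    """Mimic date('Y/m/d', $lastVisit + 24 * 60 * 60) from the question."""
    shifted = last_visit_midnight + timedelta(seconds=24 * 60 * 60)
    # Render the shifted instant in the offset that applies after the change.
    return shifted.astimezone(offset_after).date()


def days_between(last_visit: date, today: date) -> int:
    """Calendar-day difference; independent of how long any day lasted."""
    return (today - last_visit).days


last_visit = datetime(2011, 11, 6, 0, 0, tzinfo=EDT)  # local midnight, as CURDATE() would store
today = date(2011, 11, 7)

naive = naive_next_day(last_visit, EST)
print(naive, naive == today)                    # 2011-11-06 False
print(days_between(last_visit.date(), today))   # 1
```

Comparing calendar dates instead of adding a fixed number of seconds sidesteps the problem, whichever language the check is written in.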
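Answer 0 (score: 3)
I missed the mysql tag and wrote this solution. Unfortunately, it does not work in MySQL, which had no window functions at the time (they arrived in MySQL 8.0).
I am posting it anyway, since I put some effort into it. Tested with PostgreSQL. It should work similarly in Oracle or SQL Server (or any other decent RDBMS that supports window functions).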
CREATE TEMP TABLE v(id int, visit date);
INSERT INTO v VALUES
(444631, '2011-11-07')
,(444631, '2011-11-06')
,(444631, '2011-11-05')
,(444631, '2011-11-04')
,(444631, '2011-11-02')
,(444631, '2011-11-01')
,(444632, '2011-12-02')
,(444632, '2011-12-03')
,(444632, '2011-12-05');
-- add 1 to "difference" to get number of days of the longest period
SELECT id, max(dur) + 1 AS max_consecutive_days
FROM (
   -- calculate date difference of min and max in the group
   SELECT id, grp, max(visit) - min(visit) AS dur
   FROM (
      -- consecutive days end up in a group
      SELECT *, sum(step) OVER (ORDER BY id, rn) AS grp
      FROM (
         -- step up at the start of a new group of days
         SELECT id
               ,row_number() OVER w AS rn
               ,visit
               ,CASE WHEN COALESCE(visit - lag(visit) OVER w, 1) = 1
                     THEN 0 ELSE 1 END AS step
         FROM   v
         WINDOW w AS (PARTITION BY id ORDER BY visit)
         ORDER  BY 1,2
         ) x
      ) y
   GROUP BY 1,2
   ) z
GROUP BY 1
ORDER BY 1
LIMIT 1;
Output:
id | max_consecutive_days
--------+----------------------
444631 | 4
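For readers who find the nested query hard to follow, the same idea can be sketched outside the database. This is a minimal Python illustration, not a replacement for the SQL: it flags a step wherever the gap to the previous visit is not exactly one day, takes a running sum of the flags as the group number, and measures the longest group. The function name longest_streak_by_steps is hypothetical.

```python
from datetime import date
from itertools import accumulate


def longest_streak_by_steps(visits):
    """Longest run of consecutive days, via a step flag and a running sum."""
    days = sorted(set(visits))
    if not days:
        return 0
    # step = 1 where a new run starts (gap to the previous day is not 1).
    # The first row gets 0, like COALESCE(..., 1) = 1 in the query.
    steps = [0] + [0 if (cur - prev).days == 1 else 1
                   for prev, cur in zip(days, days[1:])]
    groups = list(accumulate(steps))  # running sum acts as the group number
    spans = {}
    for grp, day in zip(groups, days):
        first, _ = spans.get(grp, (day, day))
        spans[grp] = (first, day)  # days are sorted, so `day` is the latest so far
    # max(visit) - min(visit) per group, plus 1 to turn a difference into a count
    return max((last - first).days for first, last in spans.values()) + 1


user_444631 = [date(2011, 11, d) for d in (7, 6, 5, 4, 2, 1)]
print(longest_streak_by_steps(user_444631))  # 4
```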
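I later found a better way. The grp
numbers are not consecutive (but they keep rising). That does not matter, since they are only a means to an end: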
SELECT id, max(dur) + 1 AS max_consecutive_days
FROM (
   SELECT id, grp, max(visit) - min(visit) AS dur
   FROM (
      -- subtract row_number() from an integer day number;
      -- the result is a "group number" (grp) that is constant for consecutive days
      SELECT id
            ,EXTRACT(epoch from visit)::int / 86400
           - row_number() OVER (PARTITION BY id ORDER BY visit) AS grp
            ,visit
      FROM   v
      ORDER  BY 1,2
      ) x
   GROUP BY 1,2
   ) y
GROUP BY 1
ORDER BY 1
LIMIT 1;
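The trick in this second query is that a day number minus its position in the sorted list stays constant across a run of consecutive days. A small Python sketch of that idea follows, assuming one date per user per day as in the question. The names streaks, longest_streak and latest_streak are hypothetical. Note that the query above returns the longest run, while the question may want the run that ends at the most recent visit, so the sketch exposes both.

```python
from datetime import date
from itertools import groupby


def streaks(visits):
    """Return (first_day, last_day, length) for each run of consecutive days."""
    days = sorted(set(visits))
    # Day ordinal minus list position is constant within a consecutive run
    # and only ever grows between runs, so equal keys are always adjacent.
    keyed = [(day.toordinal() - i, day) for i, day in enumerate(days)]
    runs = []
    for _, members in groupby(keyed, key=lambda pair: pair[0]):
        run = [day for _, day in members]
        runs.append((run[0], run[-1], len(run)))
    return runs


def longest_streak(visits):
    """Length of the longest run (what the query computes)."""
    return max((length for _, _, length in streaks(visits)), default=0)


def latest_streak(visits):
    """Length of the run that ends at the most recent visit."""
    runs = streaks(visits)
    return runs[-1][2] if runs else 0


user_444631 = [date(2011, 11, d) for d in (7, 6, 5, 4, 2, 1)]
user_444632 = [date(2011, 12, d) for d in (2, 3, 5)]
print(longest_streak(user_444631), latest_streak(user_444631))  # 4 4
print(longest_streak(user_444632), latest_streak(user_444632))  # 2 1
```

latest_streak does not check whether the most recent visit was today or yesterday; a caller that wants a "still alive" streak would have to compare the run's last day with the current date.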
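Answer 1 (score: 1)
If you do not need a log of every day a user logged in to the website, and you only want to know how many consecutive days they have logged in, I would prefer this approach:
Use three columns: LastVisit (date), ConsecutiveDays (int), and User.
At login, check the user's row. If the last visit was "today - 1", add 1 to the ConsecutiveDays column and store "today" in the LastVisit column. If the last visit is older than "today - 1", store 1 in ConsecutiveDays and "today" in LastVisit.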
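The login check described above might look like the following. This is a hedged Python sketch with an in-memory stand-in for the row, not code from the answer: VisitState and record_visit are hypothetical names, and the "same day, do nothing" branch is an assumption the answer leaves implicit.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class VisitState:
    """Hypothetical stand-in for one user's LastVisit / ConsecutiveDays row."""
    last_visit: Optional[date] = None
    consecutive_days: int = 0


def record_visit(state: VisitState, today: date) -> VisitState:
    """Update the counter for a login on `today`, comparing calendar dates."""
    if state.last_visit == today:
        return state  # already counted today (assumed behaviour)
    if state.last_visit is not None and (today - state.last_visit).days == 1:
        state.consecutive_days += 1  # last visit was "today - 1"
    else:
        state.consecutive_days = 1   # first visit, or at least one day skipped
    state.last_visit = today
    return state


state = VisitState()
for day in (1, 2, 4, 5, 6, 7, 7):
    record_visit(state, date(2011, 11, day))
print(state.consecutive_days)  # 4
```

Because the comparison uses date subtraction rather than seconds, a 23-hour or 25-hour day does not affect the result.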
HTH