How to determine whether two records are 1 year apart (using timestamps)

Time: 2012-03-05 17:56:17

Tags: sql sql-server sql-server-2005 tsql

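I need to analyse some web logs and determine whether a user visited once, took a year's break, and then visited again. I want to add a flag (Y/N) to each row whose VisitId meets that criterion.

How would I go about writing this SQL?

Here are the fields I have that I think I need to use (by analysing the timestamp of the first page of each visit):

  • VisitID - each visit has a unique ID (e.g. 12356, 12345, 16459)
  • UserID - each user has one ID (e.g. steve = 1, ted = 2, mark = 12345, and so on)
  • TimeStamp - looks like this: 2010-01-01 00:32:30.000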

select VisitID, UserID, TimeStamp from page_view_t where pageNum = 1;

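Thanks - any help is much appreciated.

3 Answers:

Answer 0: (score: 5)

You could rank each user's rows by timestamp and then match the ranked rowset against itself to compare adjacent rows: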

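;WITH ranked AS (
  -- Number each user's visits in chronological order.
  -- Only the first page of each visit is ranked (pageNum = 1, as in the question),
  -- so that adjacent rows are different visits.
  SELECT
    *,
    rnk = ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY TimeStamp)
  FROM page_view_t
  WHERE pageNum = 1
),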
flagged AS (
  SELECT
    *,
    IsReturnVisit = CASE
      WHEN EXISTS (
        SELECT *
        FROM ranked
        WHERE UserID = r.UserID
          AND rnk = r.rnk - 1
          AND TimeStamp <= DATEADD(YEAR, -1, r.TimeStamp)
      )
      THEN 'Y'
      ELSE 'N'
    END
  FROM ranked r
)
SELECT
  VisitID,
  UserID,
  TimeStamp,
  IsReturnVisit
FROM flagged

Note: the above flags only return visits.
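
To try the idea on a few rows, here is a minimal sketch that uses a table variable in place of page_view_t. The sample rows, the VisitID values and the name @page_view_t are made up for illustration, and the flag is computed with a LEFT JOIN to the previous row rather than with EXISTS, which should give the same result:

```sql
-- Hypothetical sample data; the table variable stands in for page_view_t.
DECLARE @page_view_t TABLE (
  VisitID     int,
  UserID      int,
  [TimeStamp] datetime,
  pageNum     int
);

INSERT INTO @page_view_t (VisitID, UserID, [TimeStamp], pageNum)
SELECT 101, 1, '2010-01-01T00:32:30', 1 UNION ALL
SELECT 102, 1, '2010-06-15T10:00:00', 1 UNION ALL
SELECT 103, 1, '2011-07-01T09:00:00', 1 UNION ALL  -- more than a year after visit 102
SELECT 201, 2, '2010-03-01T12:00:00', 1 UNION ALL
SELECT 202, 2, '2010-03-02T12:00:00', 1;

WITH ranked AS (
  -- Number each user's visits in chronological order.
  SELECT
    VisitID,
    UserID,
    [TimeStamp],
    rnk = ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY [TimeStamp])
  FROM @page_view_t
  WHERE pageNum = 1
)
SELECT
  r.VisitID,
  r.UserID,
  r.[TimeStamp],
  -- p is the same user's previous visit; it is NULL for a first visit,
  -- in which case the comparison is not true and the flag is 'N'.
  IsReturnVisit = CASE
    WHEN p.[TimeStamp] <= DATEADD(YEAR, -1, r.[TimeStamp]) THEN 'Y'
    ELSE 'N'
  END
FROM ranked r
  LEFT JOIN ranked p ON p.UserID = r.UserID AND p.rnk = r.rnk - 1
ORDER BY r.UserID, r.[TimeStamp];
-- Expected flags: 101 N, 102 N, 103 Y, 201 N, 202 N
```

The rows are inserted with SELECT ... UNION ALL because multi-row VALUES lists are not available on SQL Server 2005.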

UPDATE

To flag first visits as well as return visits, you could modify the flagged CTE as follows:

…
SELECT
  r.*,  -- r's columns only: a bare * would also return p's columns and duplicate the CTE's column names
  IsFirstOrReturnVisit = CASE
    WHEN p.UserID IS NULL OR r.TimeStamp >= DATEADD(YEAR, 1, p.TimeStamp)
    THEN 'Y'
    ELSE 'N'
  END
FROM ranked r
  LEFT JOIN ranked p ON r.UserID = p.UserID AND r.rnk = p.rnk + 1
…
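
On SQL Server 2012 or later (not the 2005 version this question is tagged with), the previous visit's timestamp could be fetched with LAG instead of a self-join. This is only a sketch under that version assumption; the alias v and the column name PrevTimeStamp are illustrative:

```sql
-- Assumes SQL Server 2012 or later; LAG is not available on SQL Server 2005.
SELECT
  v.VisitID,
  v.UserID,
  v.[TimeStamp],
  IsFirstOrReturnVisit = CASE
    WHEN v.PrevTimeStamp IS NULL                             -- the user's first visit
      OR v.[TimeStamp] >= DATEADD(YEAR, 1, v.PrevTimeStamp)  -- a year or more since the previous visit
    THEN 'Y'
    ELSE 'N'
  END
FROM (
  SELECT
    VisitID,
    UserID,
    [TimeStamp],
    -- Timestamp of the same user's previous visit, or NULL if there is none.
    PrevTimeStamp = LAG([TimeStamp]) OVER (PARTITION BY UserID ORDER BY [TimeStamp])
  FROM page_view_t
  WHERE pageNum = 1
) AS v;
```

The derived table is there because a window function cannot be referenced by its alias in the same SELECT list.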

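Answer 1: (score: 1)

The other answer was posted faster, but since I spent a lot of time on this and it is a completely different approach, I might as well post it :D.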

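SELECT pv2.VisitID,
       pv2.UserID,
       pv2.TimeStamp,
       -- 'YES' if some visit a year or more away has no other visit in between.
       -- 'YES' sorts after 'NO', so MAX collapses the joined rows to one flag per visit.
       MAX(CASE WHEN pv1.VisitID IS NOT NULL
                 AND pv3.VisitID IS NULL
           THEN 'YES' ELSE 'NO' END) AS IsReturnVisit
FROM page_view_t pv2
-- pv1: another visit by the same user, at least a year before or after pv2
LEFT JOIN page_view_t pv1 ON pv1.UserID = pv2.UserID
                          AND pv1.VisitID <> pv2.VisitID
                          AND (pv1.TimeStamp <= DATEADD(YEAR, -1, pv2.TimeStamp)
                               OR pv2.TimeStamp <= DATEADD(YEAR, -1, pv1.TimeStamp))
                          AND pv1.pageNum = 1
-- pv3: a third visit between pv1 and pv2. BETWEEN is inclusive, so pv1 and pv2
-- themselves are excluded; otherwise pv3 would always match and the flag would always be 'NO'.
LEFT JOIN page_view_t pv3 ON pv1.UserID = pv3.UserID
                          AND pv3.VisitID NOT IN (pv1.VisitID, pv2.VisitID)
                          AND (pv3.TimeStamp BETWEEN pv1.TimeStamp AND pv2.TimeStamp
                               OR pv3.TimeStamp BETWEEN pv2.TimeStamp AND pv1.TimeStamp)
                          AND pv3.pageNum = 1
WHERE pv2.pageNum = 1
GROUP BY pv2.VisitID, pv2.UserID, pv2.TimeStamp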

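Answer 2: (score: 1)

Assuming the page_view_t table stores the UserID and TimeStamp of each visit by a user, the following query returns the users who took a break of at least one year (365 days) between two consecutive visits.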

select t1.UserID
from page_view_t t1
where (
    select datediff(day, max(t2.[TimeStamp]), t1.[TimeStamp])
    from page_view_t t2
    where t2.UserID = t1.UserID and t2.[TimeStamp] < t1.[TimeStamp]
    group by t2.UserID
) >= 365
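
One thing to keep in mind with the 365-day test: DATEDIFF(day, ...) counts the day boundaries crossed, so it is not quite the same as the calendar-year comparison with DATEADD(YEAR, ...) used in the other answers. A small sketch of the difference, using two made-up dates in a leap year (the variable and column names are illustrative only):

```sql
-- Hypothetical dates: 2012 is a leap year, so 1 Jan to 31 Dec spans 365 days
-- but is still one day short of a full calendar year.
DECLARE @prev datetime, @curr datetime;
SET @prev = '20120101';
SET @curr = '20121231';

SELECT
  DaysApart     = DATEDIFF(day, @prev, @curr),                                          -- 365
  DayBasedFlag  = CASE WHEN DATEDIFF(day, @prev, @curr) >= 365 THEN 'Y' ELSE 'N' END,   -- Y
  YearBasedFlag = CASE WHEN @curr >= DATEADD(YEAR, 1, @prev) THEN 'Y' ELSE 'N' END;     -- N
```

Which of the two definitions of "a year" is the right one depends on the reporting requirement; the question does not say.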