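Hi, I'm trying to find duplicate web visits within the past 7 days. I've built a query, but it takes too long to run. Any help optimizing this query would be greatly appreciated. I'm using VisitorGuid to find the duplicates.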
WITH TooClose
AS
(
SELECT
a.visitid AS BeforeID,
b.visitID AS AfterID,
a.omniturecid as [Before om id],
b.omniturecid as [after om id],
a.pubid as [Before pub id],
b.pubid as [after pub id],
a.VisitorGuid as [Before guid],
b.VisitorGuid as [after guid],
a.date as [Before date],
b.date as [after date]
FROM
webvisits a
INNER JOIN WebVisits b ON a.VisitorGuid = b.VisitorGuid
AND a.date < b.Date
AND DATEDIFF(DAY, a.date, b.date) < 7
Where a.Date >= '7/1/2015')
SELECT
*
FROM
TooClose
WHERE
BeforeID NOT IN (SELECT AfterID FROM TooClose)
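Answer 0: (score: 0)
If I understand your question correctly, you're trying to find all duplicate web visits within the past 7 days. I'm not sure what counts as a duplicate web visit, but here is my attempt, which may work for you: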
;WITH q1
AS (
SELECT a.VisitorGuid
,a.date
FROM webvisits a
WHERE a.DATE >= DATEADD(DAY, -7, cast(getdate() as date))
)
,q2 AS
(SELECT q1.VisitorGuid
,count(*) as rcount
FROM q1
GROUP BY q1.VisitorGuid
)
SELECT q2.VisitorGuid
FROM q2
WHERE q2.rcount > 1
Updated:
;WITH q1
AS (
SELECT a.VisitorGuid
,a.date
,a.omniturecid
FROM webvisits a
WHERE a.DATE >= DATEADD(DAY, -7, cast(getdate() as date))
)
,q2 AS
(SELECT q1.VisitorGuid
,count(*) as rcount
FROM q1
GROUP BY q1.VisitorGuid
HAVING Count(*)> 1
)
SELECT q1.VisitorGuid,
q1.omniturecid,
q1.date
FROM q1
INNER JOIN q2 on q1.VisitorGuid = q2.VisitorGuid
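The join back to q2 can probably be avoided with a windowed count, which reads the filtered rows once. The following is only a sketch under assumptions: it needs SQL Server 2008 or later for the sample-data syntax, and the table variable @webvisits, its column types, and the sample rows are hypothetical stand-ins for the real webvisits table.

```sql
-- Hypothetical stand-in for webvisits; column names follow the question, types are assumed.
DECLARE @webvisits TABLE (
    VisitorGuid UNIQUEIDENTIFIER NOT NULL,
    [date]      DATETIME         NOT NULL,
    omniturecid VARCHAR(50)      NULL
);

DECLARE @g1 UNIQUEIDENTIFIER = NEWID(),
        @g2 UNIQUEIDENTIFIER = NEWID();

-- Made-up sample rows, dated relative to today so the 7-day filter applies.
INSERT INTO @webvisits (VisitorGuid, [date], omniturecid)
VALUES (@g1, DATEADD(DAY, -1,  GETDATE()), 'cid-a'),
       (@g1, DATEADD(DAY, -3,  GETDATE()), 'cid-b'),
       (@g1, DATEADD(DAY, -30, GETDATE()), 'cid-c'),  -- older than 7 days, filtered out
       (@g2, DATEADD(DAY, -2,  GETDATE()), 'cid-d');  -- only visit for this visitor

-- Count visits per visitor inside the 7-day window, then keep visitors with more than one.
SELECT v.VisitorGuid,
       v.omniturecid,
       v.[date]
FROM (
    SELECT VisitorGuid,
           omniturecid,
           [date],
           COUNT(*) OVER (PARTITION BY VisitorGuid) AS rcount
    FROM @webvisits
    WHERE [date] >= DATEADD(DAY, -7, CAST(GETDATE() AS date))
) AS v
WHERE v.rcount > 1;
```

With these sample rows, only the two recent visits for @g1 should come back. Whether this is actually faster than the CTE version depends on the indexes and data volume, so compare the execution plans before relying on it.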
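Answer 1: (score: 0)
As others have pointed out, it's a good idea to provide sample data, preferably as a SQL Fiddle or as CREATE and INSERT statements. Here is one approach. I was too lazy to invent a table structure and fill it with data to test this, so it may contain some errors: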
SELECT ... FROM (
SELECT VisitorGuid
, date
, lead(date) over (partition by visitorguid
order by date) as next_date
, omniturecid
, lead(omniturecid) over (partition by visitorguid
order by date) as next_omniturecid
, ...
, lead(...) ...
FROM webvisits a
) as x
WHERE DATEDIFF(DAY, date, next_date) < 7
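To make the LEAD idea concrete, here is a self-contained sketch. It is an illustration rather than a tested replacement for the original query: LEAD requires SQL Server 2012 or later, the table variable @webvisits and its column types are assumed, and the sample rows are invented. Note that it only compares each visit with the same visitor's next visit, which may not match the original self-join in every case.

```sql
-- Hypothetical stand-in for webvisits; column names follow the question, types are assumed.
DECLARE @webvisits TABLE (
    visitid     INT              NOT NULL,
    VisitorGuid UNIQUEIDENTIFIER NOT NULL,
    [date]      DATETIME         NOT NULL,
    omniturecid VARCHAR(50)      NULL
);

DECLARE @g1 UNIQUEIDENTIFIER = NEWID(),
        @g2 UNIQUEIDENTIFIER = NEWID();

-- Made-up sample rows.
INSERT INTO @webvisits (visitid, VisitorGuid, [date], omniturecid)
VALUES (1, @g1, '20150701', 'cid-a'),
       (2, @g1, '20150703', 'cid-b'),  -- 2 days after visit 1
       (3, @g1, '20150720', 'cid-c'),  -- 17 days after visit 2
       (4, @g2, '20150702', 'cid-d');  -- only visit for this visitor

-- Pair each visit with the same visitor's next visit, then keep pairs less than 7 days apart.
SELECT x.visitid          AS BeforeID,
       x.next_visitid     AS AfterID,
       x.omniturecid      AS [Before om id],
       x.next_omniturecid AS [after om id],
       x.VisitorGuid,
       x.[date]           AS [Before date],
       x.next_date        AS [after date]
FROM (
    SELECT visitid,
           VisitorGuid,
           [date],
           omniturecid,
           LEAD(visitid)     OVER (PARTITION BY VisitorGuid ORDER BY [date]) AS next_visitid,
           LEAD([date])      OVER (PARTITION BY VisitorGuid ORDER BY [date]) AS next_date,
           LEAD(omniturecid) OVER (PARTITION BY VisitorGuid ORDER BY [date]) AS next_omniturecid
    FROM @webvisits
    WHERE [date] >= '20150701'
) AS x
WHERE DATEDIFF(DAY, x.[date], x.next_date) < 7;  -- rows with no next visit have a NULL next_date and drop out
```

With the sample rows above, only the pair of visits 1 and 2 should qualify. Because each row is compared with a single following row instead of every later row, this avoids the self-join on VisitorGuid that likely makes the original query slow.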