I have the following data in a table, and I want to report on it without having to delete any rows.
ActiveSearchID --- SearchDate --------------------- SearchPhrase
1 --------------------- 2010-12-15 12:01:11.587 --- argos
2 --------------------- 2010-12-15 12:03:40.193 --- muji
3 --------------------- 2010-12-15 12:03:42.370 --- muji
4 --------------------- 2010-12-15 12:04:29.167 --- Office supplies
5 --------------------- 2010-12-15 12:05:11.590 --- lava
9 --------------------- 2010-12-15 12:08:38.920 --- sony vaio
10 ------------------- 2010-12-15 12:08:41.170 --- sony vaio
12 ------------------- 2010-12-15 12:09:09.920 --- sony vaio battery
13 ------------------- 2010-12-15 12:09:17.487 --- sony vaio battery
14 ------------------- 2010-12-15 12:17:10.980 --- sony vaio battery
15 ------------------- 2010-12-15 12:17:12.170 --- argos
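The report I want selects only the first instance of each search phrase searched within a 5-minute interval.
So, for example, a query on the data above would produce the following result: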
SearchDate ---------------- SearchPhrase
2010-12-15 12:01:11.587 --- argos
2010-12-15 12:03:40.193 --- muji
2010-12-15 12:04:29.167 --- Office supplies
2010-12-15 12:05:11.590 --- lava
2010-12-15 12:08:38.920 --- sony vaio
2010-12-15 12:09:09.920 --- sony vaio battery
2010-12-15 12:17:12.170 --- argos
I have tried the following query, but I still get duplicates:
select t1.searchdate, t1.searchphrase
from activesearches t1
inner join activesearches t2
on t1.searchphrase = t2.searchphrase
and t1.searchdate < t2.searchdate
where datediff(s, t1.searchdate, t2.searchdate) <= 300
order by searchdate
I would like to use a "WITH SearchPhrases AS ()" type of query, but I can't get my head around it.
Thanks
Answer 0 (score: 0)
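I believe that, given your test data, "sony vaio battery" should be returned twice: the search at 12:17:10.980 comes more than five minutes after the previous one at 12:09:17.487. I came up with two options.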
-- Populate test data
if(OBJECT_ID('tempdb..#Search') IS NOT NULL)
DROP TABLE #Search
create table #Search (
ActiveSearchID int primary key,
SearchDate datetime not null,
SearchPhrase nvarchar(30))
insert into #Search(ActiveSearchID, SearchDate, SearchPhrase)
select 1, '2010-12-15 12:01:11.587', 'argos'
union all select 2, '2010-12-15 12:03:40.193', 'muji'
union all select 3, '2010-12-15 12:03:42.370', 'muji'
union all select 4, '2010-12-15 12:04:29.167', 'Office supplies'
union all select 5, '2010-12-15 12:05:11.590', 'lava'
union all select 9, '2010-12-15 12:08:38.920', 'sony vaio'
union all select 10, '2010-12-15 12:08:41.170', 'sony vaio'
union all select 12, '2010-12-15 12:09:09.920', 'sony vaio battery'
union all select 13, '2010-12-15 12:09:17.487', 'sony vaio battery'
union all select 14, '2010-12-15 12:17:10.980', 'sony vaio battery'
union all select 15, '2010-12-15 12:17:12.170', 'argos'
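I think you are looking for something like this query, which returns a row only if the same phrase was not searched in the five minutes before it. I don't know how well it will perform: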
select *
from #Search as S
where not exists(
select * from #Search as N
where N.SearchPhrase= S.SearchPhrase
and N.SearchDate between
dateadd(minute, -5, S.SearchDate) AND S.SearchDate
and N.ActiveSearchID <> S.ActiveSearchID)
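The same check can perhaps be sketched with LAG, which compares each row only with the previous search for the same phrase. This is an assumption on my part: LAG needs SQL Server 2012 or later, and the question does not state a version. It reads the #Search table populated above. One difference worth noting: if two rows share a phrase and the exact same SearchDate, the NOT EXISTS query drops both, while this sketch keeps the one with the lower ActiveSearchID.

```sql
-- Sketch only: requires LAG (SQL Server 2012+) and the #Search table populated above.
select ActiveSearchID, SearchDate, SearchPhrase
from (
    select
        ActiveSearchID, SearchDate, SearchPhrase,
        -- Time of the previous search for the same phrase; NULL for the first one.
        lag(SearchDate) over (
            partition by SearchPhrase
            order by SearchDate, ActiveSearchID) as PrevSearchDate
    from #Search
) as X
-- Keep a row when there is no earlier search, or the earlier one is more than 5 minutes old.
where PrevSearchDate is null
   or PrevSearchDate < dateadd(minute, -5, SearchDate)
order by ActiveSearchID;
```

Against the test data this should return ActiveSearchID 1, 2, 4, 5, 9, 12, 14 and 15, so "sony vaio battery" appears twice.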
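Alternatively, if you can use discrete 5-minute intervals aligned to the clock (12:00, 12:05, 12:10 and so on), this might perform better. I have not tested it with a large amount of data: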
select
ActiveSearchID, SearchDate, SearchPhrase
from
(
select
*,
ROW_NUMBER() over (
partition by SearchPhrase,
DATEDIFF(minute, '2000-01-01', SearchDate) / 5
order by SearchDate, ActiveSearchID) as rn,
DATEDIFF(minute, '2000-01-01', SearchDate) / 5 as five_minute_window
from #Search
) as X
where
rn = 1
order by
ActiveSearchID
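Since the question asked for a "WITH SearchPhrases AS ()" form, the bucketed query can be written as a common table expression. This is only a sketch under the same assumptions as above (the #Search table and clock-aligned buckets). The WindowStart column is my own addition for illustration; it shows the start of the 5-minute bucket each row was assigned to.

```sql
-- Sketch only: assumes the #Search table populated above.
-- The statement before a CTE must be terminated with a semicolon.
with SearchPhrases as (
    select
        ActiveSearchID, SearchDate, SearchPhrase,
        -- Start of the clock-aligned 5-minute bucket the search falls in
        -- (integer division by 5, then multiplied back to minutes).
        dateadd(minute,
                (datediff(minute, '2000-01-01', SearchDate) / 5) * 5,
                '2000-01-01') as WindowStart,
        -- Number the searches for each phrase within each bucket, earliest first.
        row_number() over (
            partition by SearchPhrase,
                         datediff(minute, '2000-01-01', SearchDate) / 5
            order by SearchDate, ActiveSearchID) as rn
    from #Search
)
select SearchDate, SearchPhrase, WindowStart
from SearchPhrases
where rn = 1
order by SearchDate;
```

The trade-off of fixed buckets is that two searches for the same phrase at, say, 12:04:59 and 12:05:01 land in different buckets, so both are returned even though they are two seconds apart. The NOT EXISTS query does not have this behaviour.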