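Characteristics

ObjectId is the unique ID of an item passing through a pipeline, so there are actually few rows per ObjectId (roughly one entry per pipeline component, of which there are about 10)
Expecting ~10K inserts per day
Performance is a requirement (so EXISTS(...) may not be an option)
The hardware is solid: it is a data-center SQL machine, but it is shared with many other teams/processes.

Problems I'm running into / things I'm trying:

This is a design, so I have no actual data. I should have a proof-of-concept database to test against
Here is what I've been thinking of trying: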

select  objectid, time, eventtype
from        objects
where       -- can't use time < @t because I won't get the later events
group by    objectid
having      --

or

select  o.objectid, o.time, o.eventtype
from    objects o
where   o.eventtype = 1
and     o.time < @t
and     exists (select  1
                from    objects o2                  -- the subquery needs its own FROM clause
                where   o2.objectid = o.objectid    -- not sure if this is legal
                and     o2.eventtype = 2
                and     o2.time > @t)

As you can probably tell, I don't write much SQL, so I'm a bit rusty.

Example

ID  objectid  eventtype  time
1   12345     1          09:00 AM
2   12345     2          10:00 AM


eventtypeid  description
1            "enter house"
2            "leave house"
3            "enter work"

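So, 4 subjects entered the house at 9 AM and left at 11 AM, and I'm trying to see whether they were in the house at 10 AM. 12345 is the subject's "name/number".

In this example, I'm trying to query whether the subject was in the house at 10:00 AM. It is entirely possible that a subject entered the house but never left; I don't want those rows for this query.

Questions

Am I on the right track?
How can I estimate the expected performance of the second query (assuming it works)?
Pointers? Suggestions? Examples?

Anything is welcome.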

Answer 1

For a given subject and a given time, you can do:

select top 1 o.*
from objects o
where eventtime < @t and
      objectid = @objectid
order by eventtime desc;

Extending this to multiple objects is simplest with window functions:

select o.*
from (select o.*,
             row_number() over (partition by objectid order by eventtime desc) as seqnum
      from objects o
      where eventtime < @t
     ) o
where seqnum = 1;

Both of these give you information about the last event before (strictly before) the given time.
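As a rough proof-of-concept sketch of how the window-function query might be applied to the "in the house at @t" check: the snippet below uses Python's built-in sqlite3 module as a stand-in for SQL Server, an assumption made only so the example is self-contained (it needs SQLite 3.25 or later for window functions). The table layout, the sample rows, and the helper name in_house_at are hypothetical. It takes the last event strictly before the cutoff for each object, keeps it only if it is an "enter house" event, and additionally requires a later "leave house" event, since the question excludes subjects who never left.

```python
import sqlite3

# Hypothetical schema modelled on the question's example table.
SCHEMA = """
create table objects (
    id        integer primary key,
    objectid  integer not null,
    eventtype integer not null,   -- 1 = enter house, 2 = leave house
    eventtime text    not null    -- zero-padded 'HH:MM' strings sort correctly as text
);
"""

# Hypothetical sample rows (not real data).
ROWS = [
    (12345, 1, "09:00"), (12345, 2, "11:00"),   # inside between 09:00 and 11:00
    (22222, 1, "09:30"),                        # entered, never left
    (33333, 1, "08:00"), (33333, 2, "09:15"),   # left again at 09:15
]

QUERY = """
select o.objectid
from (select objectid, eventtype,
             row_number() over (partition by objectid
                                order by eventtime desc) as seqnum
      from objects
      where eventtime < :t) o
where o.seqnum = 1            -- last event strictly before the cutoff
  and o.eventtype = 1         -- ... and that event is an 'enter house'
  and exists (select 1        -- ... and the subject leaves after the cutoff
              from objects l
              where l.objectid = o.objectid
                and l.eventtype = 2
                and l.eventtime > :t)
order by o.objectid
"""

def in_house_at(t):
    """Return the objectids that were in the house at time t ('HH:MM')."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "insert into objects (objectid, eventtype, eventtime) values (?, ?, ?)",
        ROWS)
    result = [row[0] for row in conn.execute(QUERY, {"t": t})]
    conn.close()
    return result
```

Calling in_house_at("10:00") runs the check for 10:00 AM against the sample rows. On SQL Server the inner SQL should port largely as written, with @t in place of :t and a proper time or datetime column instead of text.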

Answer 2

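I'm quite confused by your SQL, but from what you say it seems you want the most recent object, possibly based on whether an object reference already exists. I may be barking up the wrong tree, since the SQL confuses me, but from your requirements it appears you want objects grouped by a common objectid, which is not always unique and is of a generic type.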

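This may help you. It basically finds dupes by counting over the object ID, but I wasn't sure whether you also limit the scope, so I left that in just in case. Then, in a second pass, I partition the dupes by obj and limit the scope of the type to a variable, in case you only care about one type. You could also do this in the first pass. If the variable is null, the type is assumed to mean "everything". I have used something like this in a production environment, so it should be reliable as long as you have indexes in the right places, i.e. indexes on the Type and Datetime fields. The example is self-extracting, will run in SQL Management Studio 2008 and later, and populates the table variable automatically.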

declare @Object Table ( objectId int , typ varchar(2), obj varchar(8), dt datetime);

insert into @Object values (1, 'A', 'Brett', getdate() - 0.8) ,(1,'A','Sean', getdate() - 0.4),(1,'A','Brett', getdate() - 0.08),(2,'A','Michael', getdate() - 0.04)
,(2,'B','Ray', getdate() - 0.008),(3, 'B', 'Erik', getdate() - 0.004),(3, 'C', 'Ray', getdate() - 0.0001);

-- objects as they are
Select *
from @Object
;

-- Find dupe objects by two distinctions
select 
    obj
,   count(objectId) over(partition by obj) as rowOccurencesByTyp
,   count(objectId) over(partition by typ, obj) as rowOccurencesByTypAndObj
from @Object
;


-- limit scope by type
declare 
-- CHANGE LINE AS NEEDED TO TEST HOW IT WORKS FOR 'A', 'B' OR NULL
    @Type varchar(2) = NULL  
-- Scope range of datetime too if you want
,   @dt datetime
;


-- Find dupes first
with dupes as 
    (
        select 
        obj
    ,   typ
    ,   dt
    ,   count(objectId) over(partition by obj) as rowOccurencesByTyp
    ,   count(objectId) over(partition by typ, obj) as rowOccurencesByTypAndObj
    -- I made the Ray occurence be in DIFFERENT Types so this would be an edge case you may not want
    from @Object
    -- WHERE CLAUSE WOULD BE HERE WITH DATE RANGE.  I was lazy in my example and made it small but you could 
    -- easily limit scope of dupes by a date range of 'dt between @Start and @End' or 'dt < @dt' or 'dt > @dt'
    )
-- if you merely want to get the most recent objects you can do a windowed function to get them quite easily
, a as 
    (
    select 
        *
    ,   row_number() over(partition by obj order by dt desc) as rwn
    -- I find the ranking by shared obj and then order by date descending (most current first).
    -- You may also wish to add 'typ' to the partition before obj; I was not sure.
    from dupes
    where typ = isnull(@Type, typ)  -- limit scope by type potentially
        and rowOccurencesByTyp > 1
    -- you may set up other rowOccurrences here if that suits you better.
    )
select *
from a
where rwn = 1  
-- the most recently inserted duplicate is the dupe; the scope of the dupe is determined by
-- the most recent 'rwn' finding a repeat insert of a row from part 1,
-- ordered by date descending and grouped by its object

Answer 3

It's hard to understand exactly what you're after, but to simply return rows where a later row exists, in SQL Server 2012 you can use the LEAD function:

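-- Drop the temp table only if it is left over from a previous run
IF OBJECT_ID('tempdb..#test') IS NOT NULL DROP TABLE #test
CREATE TABLE #test (VALUE CHAR(25))
INSERT INTO #test VALUES('abcde'),('asaf'),('dogs'),(NULL),('')

-- Each row alongside the value from the next row ('Last Record' when there is none)
SELECT Value, LEAD(Value,1,'Last Record') OVER (ORDER BY Value) AS Last_Flag
FROM #test

-- Only the rows that have a following row
SELECT *
FROM (SELECT Value, LEAD(Value,1,'Last Record') OVER (ORDER BY Value) AS Last_Flag
       FROM #test
      ) sub
WHERE Last_Flag <> 'Last Record'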

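Using the test table created above, the next query pulls a value from the following row (the second argument, 1, is the offset, i.e. how many rows ahead you want to look), and 'Last Record' is the default that is seeded when there is no next row (it is NULL by default; I like to seed it with an explicit value in case your data contains NULLs). The final select then returns everything except the row that has no row after it.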

If you want this per objectID, you can add a PARTITION BY clause, i.e.:

LEAD(Value,1,'Last Record') OVER (PARTITION BY objectID ORDER BY Value)
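To make the partitioned LEAD idea concrete for the enter/leave events in the question, here is a hedged sketch. It again uses Python's built-in sqlite3 module as a stand-in for SQL Server 2012 (an assumption for the sake of a self-contained, runnable example; SQLite 3.25 or later is required). The schema, the sample rows, and the helper name visits_spanning are hypothetical. Unlike the snippet above, it leaves the LEAD default as NULL, so a subject who entered but never left simply fails the next_type = 2 comparison.

```python
import sqlite3

# Hypothetical schema and rows modelled on the question's example.
SETUP = """
create table objects (
    id        integer primary key,
    objectid  integer not null,
    eventtype integer not null,   -- 1 = enter house, 2 = leave house
    eventtime text    not null    -- zero-padded 'HH:MM' strings sort correctly as text
);
insert into objects (objectid, eventtype, eventtime) values
    (12345, 1, '09:00'), (12345, 2, '11:00'),
    (22222, 1, '09:30'),                          -- entered, never left
    (33333, 1, '08:00'), (33333, 2, '09:15'),
    (44444, 1, '07:00'), (44444, 2, '08:00'),     -- first visit
    (44444, 1, '09:50'), (44444, 2, '10:30');     -- second visit
"""

QUERY = """
select objectid, eventtime as entered, next_time as left_at
from (select objectid, eventtype, eventtime,
             -- next event of the same object, in time order
             lead(eventtype) over (partition by objectid order by eventtime) as next_type,
             lead(eventtime) over (partition by objectid order by eventtime) as next_time
      from objects) e
where eventtype = 1 and eventtime < :t     -- entered before the cutoff
  and next_type = 2 and next_time > :t     -- next event is a leave after it
order by objectid
"""

def visits_spanning(t):
    """Return (objectid, entered, left_at) for visits that contain time t."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP)
    rows = conn.execute(QUERY, {"t": t}).fetchall()
    conn.close()
    return rows
```

For example, visits_spanning("10:00") pairs each "enter house" row with the event that immediately follows it for the same object. The design choice here is that LEAD only looks one event ahead, so this works when enter and leave events alternate per object; if other event types can occur in between, the EXISTS form from the question may be the safer fit.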

SQL Help: query to return rows where a later row exists