T-SQL: How to apply a condition to a subgroup

Asked: 2012-12-10 00:18:30

Tags: tsql group-by

I have the following table, which holds multiple codes for a single person over different periods (id is the primary key):

id   code   Name  Start                     Finish
325  1353   Bob   NULL                      2012-07-03 16:21:16.067
1742 1353   Bob   2012-07-03 16:21:16.067   2012-08-03 15:56:29.897
1803 1353   Bob   2012-08-03 15:56:29.897   NULL
17   575    Bob   NULL                      NULL
270  834    Bob   NULL                      2012-07-20 15:51:19.913
1780 834    Bob   2012-07-20 15:51:19.913   2012-07-26 16:26:54.413
1789 834    Bob   2012-07-26 16:26:54.413   2012-08-21 15:36:58.940
1830 834    Bob   2012-08-21 15:36:58.940   2012-08-24 14:26:05.890
1835 834    Bob   2012-08-24 14:26:05.890   2012-08-30 12:01:05.313
1838 123    Bob   2012-08-30 12:01:05.313   2012-09-05 09:29:02.497
1844 900    Bob   2012-09-05 09:29:02.497   NULL

What I want to do is update the table so that every row takes its code from the latest person:

id   code   Name Start                      Finish
325  900    Bob  NULL                       2012-07-03 16:21:16.067
1742 900    Bob  2012-07-03 16:21:16.067    2012-08-03 15:56:29.897
1803 900    Bob  2012-08-03 15:56:29.897    NULL
17   900    Bob  NULL                       NULL
270  900    Bob  NULL                       2012-07-20 15:51:19.913
1780 900    Bob  2012-07-20 15:51:19.913    2012-07-26 16:26:54.413
1789 900    Bob  2012-07-26 16:26:54.413    2012-08-21 15:36:58.940
1830 900    Bob  2012-08-21 15:36:58.940    2012-08-24 14:26:05.890
1835 900    Bob  2012-08-24 14:26:05.890    2012-08-30 12:01:05.313
1838 900    Bob  2012-08-30 12:01:05.313    2012-09-05 09:29:02.497
1844 900    Bob  2012-09-05 09:29:02.497    NULL

The latest person is defined as the row with the latest (max?) Start AND (Finish IS NULL or Finish >= GetDate()) WITHIN the group of people with the same Name AND Code.

In the example above, that is id = 1844 (within Bob's group it has the latest Start, and its Finish is NULL).

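I'm fairly sure this can be done in a single statement, but I can't see how to define the "latest person" in a way that lets me join to it and get the rows I want to update.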

Edit: Note that I cannot rely on the ordering of the Id column, only on the date columns.
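
For anyone who wants to reproduce the problem, the sample data could be loaded with a script along these lines. This is only a sketch: the table name table1 and all column types are assumptions, since the question states neither.

```sql
-- Assumed schema: the question gives no table name or column types.
create table table1 (
    id     int         not null primary key,
    code   int         not null,
    Name   varchar(50) not null,
    Start  datetime    null,
    Finish datetime    null
);

-- Sample rows copied from the first table in the question.
insert into table1 (id, code, Name, Start, Finish) values
    (325,  1353, 'Bob', null,                      '2012-07-03T16:21:16.067'),
    (1742, 1353, 'Bob', '2012-07-03T16:21:16.067', '2012-08-03T15:56:29.897'),
    (1803, 1353, 'Bob', '2012-08-03T15:56:29.897', null),
    (17,   575,  'Bob', null,                      null),
    (270,  834,  'Bob', null,                      '2012-07-20T15:51:19.913'),
    (1780, 834,  'Bob', '2012-07-20T15:51:19.913', '2012-07-26T16:26:54.413'),
    (1789, 834,  'Bob', '2012-07-26T16:26:54.413', '2012-08-21T15:36:58.940'),
    (1830, 834,  'Bob', '2012-08-21T15:36:58.940', '2012-08-24T14:26:05.890'),
    (1835, 834,  'Bob', '2012-08-24T14:26:05.890', '2012-08-30T12:01:05.313'),
    (1838, 123,  'Bob', '2012-08-30T12:01:05.313', '2012-09-05T09:29:02.497'),
    (1844, 900,  'Bob', '2012-09-05T09:29:02.497', null);
```

The multi-row values list needs SQL Server 2008 or later; on older versions each row would need its own insert statement.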

1 Answer:

Answer 0 (score: 1)

Something like this would do it:

update this set code = (
   select top (1) that.code from table1 that
   where that.name = this.name -- match on name
     and (that.Finish is null or that.Finish >= getdate()) -- filter for current rows only
   order by that.Start desc, that.id desc -- rank by start, break ties with id
   )
from table1 this

I hope your table is indexed and/or not too large, because doing this in a single step is expensive.
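
As a rough illustration of what "indexed" might mean here, an index shaped like the correlated lookup (match on name, newest Start first, id as tie-breaker) is one plausible candidate. The index name is hypothetical and the column choice is an assumption drawn from the query above, not something measured against a real workload.

```sql
-- Hypothetical index name; key columns follow the lookup's
-- "where name = ... order by Start desc, id desc" pattern.
create nonclustered index IX_table1_name_start
    on table1 (Name, Start desc, id desc)
    include (code, Finish);   -- lets the lookup read code and Finish from the index alone
```

Because code is stored in this index, every update to code also has to maintain the index, so whether it pays off is worth checking against the actual execution plan.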

An alternative form, using OUTER APPLY, which is easier to extend:

update this set code = that.code
from table1 this
outer apply (
   select top (1) that.code from table1 that
   where that.name = this.name -- match on name
     and (that.Finish is null or that.Finish >= getdate()) -- filter for current rows
   order by that.Start desc, that.id desc -- rank by start, break ties with id
   ) that
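
One possible extension, sketched here under the assumption that code is never NULL in table1: add a WHERE clause so the statement leaves a name alone when it has no current row (the plain OUTER APPLY form would set code to NULL in that case) and skips rows that already carry the latest code.

```sql
update this set code = that.code
from table1 this
outer apply (
   select top (1) that.code from table1 that
   where that.name = this.name -- match on name
     and (that.Finish is null or that.Finish >= getdate()) -- current rows only
   order by that.Start desc, that.id desc -- rank by start, break ties with id
   ) that
where that.code is not null     -- no current row for this name: leave code untouched
  and this.code <> that.code;   -- already up to date: nothing to write
```

Skipping unchanged rows reduces the number of writes, which may soften the cost concern on larger tables.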

An alternative using window functions, with no join:

update this set code = _latest_code
from (
  -- identify the latest code per name
  select *, _latest_code = max(
    case 
      when (finish is null or finish >= getdate()) 
       and _row_number = 1
      then code else null 
    end
    ) over (partition by name)
  from (
    -- identify the latest row per name
    select *, _row_number = row_number() over (
      partition by name order by 
        case when finish is null or finish >= getdate() then 0 else 1 end
      , start desc, id desc)
    from table1
    ) this
  ) this
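
Whichever form is used, it may be worth previewing the result with a read-only query before running the update. The sketch below reuses the OUTER APPLY lookup as a plain SELECT; the column aliases old_code and new_code are made up for illustration.

```sql
-- Read-only preview: shows what each row's code would become.
select this.id, this.Name, this.code as old_code, that.code as new_code
from table1 this
outer apply (
   select top (1) that.code from table1 that
   where that.name = this.name -- match on name
     and (that.Finish is null or that.Finish >= getdate()) -- current rows only
   order by that.Start desc, that.id desc -- rank by start, break ties with id
   ) that
order by this.Name, this.id;
```

For the sample data in the question, new_code should come out as 900 on every row, since id 1844 is the current row with the latest Start. A NULL in new_code would flag a name that has no current row.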