How to merge records by latest value

Time: 2018-02-27 17:28:35

Tags: sql-server merge subquery

I have this:

name time      val1 val2 val3
bill 12/1/2010 2    3    4
bill 12/2/2010 1         5
bill 11/1/2010 1    NULL 5

....

How can I end up with this:

name time      val1 val2 val3
bill 12/2/2010 1    3    5

In my case, name is a unique value and is guaranteed to belong to the same person. I tried:

select * from table1 t1
INNER JOIN table1 t2 
ON t2.name = t1.name
GROUP BY t1.name;

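But this doesn't solve the problem of how to get the latest value. I'd appreciate any suggestions, along with the thought process behind them - I'm having trouble combining the 'latest' value based on the timestamp with the 'complete' value based on whether they answered the question.

Here's T-SQL code to reproduce my (tiny) dataset: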

CREATE TABLE [dbo].[table1](
    [name] [varchar](50) NOT NULL PRIMARY KEY CLUSTERED,
    [time] [varchar](20) NOT NULL,
    [val1] [varchar](50),
    [val2] [varchar](50),
    [val3] [varchar](50),
)

GO

INSERT INTO table1 VALUES ('bill','12/1/2010','2','3','1');
INSERT INTO table1 VALUES ('bill','12/2/2010','1','','5'); ---NO TEXT ENTERED
INSERT INTO table1 VALUES ('bill','11/1/2010','2',NULL,'1'); ---QUESTION NOT SEEN. PUTS NULL IN RESULT

GO

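1 answer:

Answer 0: (score: 0)

To be honest, this is a lot harder than it needs to be, because your data doesn't appear to be well normalized, and then you are trying to get an even more denormalized result set out of it. If the data were properly normalized, this would be much easier. I had to fix your DDL (you can't have a primary key on name when the same name appears in several rows). I also had to fix your sample data to match what you originally posted.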
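To illustrate the normalization remark, here is a minimal sketch of what a normalized layout might look like. The table name answers and the columns answered_on, question and answer are hypothetical and not part of the original post; the rows are the sample data with unanswered questions simply left out.

```sql
-- Hypothetical normalized layout: one row per person, date and question.
-- Unanswered questions (NULL or '') are not stored at all.
create table dbo.answers (
    name        varchar(50) not null,
    answered_on date        not null,
    question    varchar(20) not null,   -- 'val1', 'val2' or 'val3'
    answer      varchar(50) not null,
    primary key (name, answered_on, question)
);

insert into dbo.answers (name, answered_on, question, answer) values
    ('bill', '20101201', 'val1', '2'),
    ('bill', '20101201', 'val2', '3'),
    ('bill', '20101201', 'val3', '1'),
    ('bill', '20101202', 'val1', '1'),
    ('bill', '20101202', 'val3', '5'),
    ('bill', '20101101', 'val1', '2'),
    ('bill', '20101101', 'val3', '1');

-- Latest answer per person and question: rank newest first, keep rank 1.
with ranked as
(
    select name
        , question
        , answer
        , RowNum = ROW_NUMBER() over(partition by name, question order by answered_on desc)
    from dbo.answers
)
select name, question, answer
from ranked
where RowNum = 1;
```

With this shape a single ROW_NUMBER() pass covers every question, and a real date column makes "latest" unambiguous. The result comes back as one row per question rather than one wide row, so it would still need a pivot if the wide format is required.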

I'm sure there are other ways to solve this for the table as it stands, but this is what I came up with.

CREATE TABLE [dbo].[table1](
    [name] [varchar](50) NOT NULL,
    [time] [varchar](20) NOT NULL,
    [val1] [varchar](50),
    [val2] [varchar](50),
    [val3] [varchar](50),
)

GO

INSERT INTO table1 VALUES ('bill','12/1/2010','2','3','1');
INSERT INTO table1 VALUES ('bill','12/2/2010','1','','5'); ---NO TEXT ENTERED
INSERT INTO table1 VALUES ('bill','11/1/2010','2',NULL,'1'); ---QUESTION NOT SEEN. PUTS NULL IN RESULT
;

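-- One CTE per value column: keep only rows that hold a real answer
-- (val > '' drops both NULL and empty strings) and rank them newest first.
-- Note: [time] is varchar, so "order by time desc" compares strings, not dates.
with val1 as
(
    select name
        , val1
        , RowNum = ROW_NUMBER() over(partition by name order by time desc)
    from table1
    where val1 > ''
)
, val2 as
(
    select name
        , val2
        , RowNum = ROW_NUMBER() over(partition by name order by time desc)
    from table1
    where val2 > ''
)
, val3 as
(
    select name
        , val3
        , RowNum = ROW_NUMBER() over(partition by name order by time desc)
    from table1
    where val3 > ''
)

-- Join the newest non-empty value of each column back to the name.
-- Every row of a name gets the same v1/v2/v3 values, so max() only
-- collapses the duplicates produced by the join.
select t.name
    , [time] = max(t.time)
    , val1 = max(v1.val1)
    , val2 = max(v2.val2)
    , val3 = max(v3.val3)
from table1 t
left join val1 v1 on v1.name = t.name and v1.RowNum = 1
left join val2 v2 on v2.name = t.name and v2.RowNum = 1
left join val3 v3 on v3.name = t.name and v3.RowNum = 1
group by t.name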
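
One caveat about the query above: [time] is declared as varchar(20), so both order by time desc and max(t.time) compare strings rather than dates. That works for the three sample rows, but a value such as '9/1/2010' would sort after '12/2/2010'. The following is a minimal sketch of a variant that orders by a real date. It assumes every [time] value is m/d/yyyy text that CONVERT style 101 can parse, which the post does not state, and it swaps the three CTEs for correlated TOP (1) subqueries; the column name latest_time is made up for this example.

```sql
-- Sketch only. Assumption: [time] always holds m/d/yyyy text (style 101).
select t.name
    , latest_time = max(convert(date, t.[time], 101))
      -- newest non-empty val1 for this name, by real date
    , val1 = (select top (1) x.val1
              from table1 x
              where x.name = t.name and x.val1 > ''
              order by convert(date, x.[time], 101) desc)
      -- newest non-empty val2 for this name
    , val2 = (select top (1) x.val2
              from table1 x
              where x.name = t.name and x.val2 > ''
              order by convert(date, x.[time], 101) desc)
      -- newest non-empty val3 for this name
    , val3 = (select top (1) x.val3
              from table1 x
              where x.name = t.name and x.val3 > ''
              order by convert(date, x.[time], 101) desc)
from table1 t
group by t.name;
```

If changing the table is an option, storing [time] as date or datetime in the first place would remove the need for the conversion in every query.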