How to ... from multiple rows

Time: 2016-01-09 17:04:20

Tags: sql sql-server tsql

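The situation is as follows:

I am collecting data about companies from several sources, but let's call them customers for simplicity.

For the same customer, I can get multiple rows on the same day or on different days. I want to use SCD2 to keep the history, but some sources do not give me data for all fields. Missing fields can show up as 'N/A' or NULL. This is what I want to do:

a) If two or more rows are identical except for the date, they should produce a single row with the earliest date.
b) If one or more fields have changed, create a new SCD2 row with the change date as its start date.
c) If one or more fields in the new row from b) have changed from a legitimate value to 'N/A', the row should carry the most recent legitimate value for those fields (from the previous row).

I am using SQL Server and T-SQL.

I hope I have explained this clearly :-)

Thanks again

Edit (from the comments):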

CustomerHistoryId | CustomerNum | CustomerName     | Planet    | ChangeDate
------------------|-------------|------------------|-----------|-----------------
1                 | 101         | Anakin Skywalker | Tatooine  | 14.03.2015 15:41
2                 | 102         | Yoda             | Coruscant | 14.03.2015 15:41
3                 | 103         | Obi-Wan Kenobi   | Coruscant | 24.03.2015 15:41
4                 | 102         | Yoda             | Coruscant | 29.03.2015 15:41
5                 | 102         | Yoda             | NULL      | 03.04.2015 15:41
6                 | 102         | Yoda             | NULL      | 04.04.2015
7                 | 103         | Obi-Wan Kenobi   | Degobah   | 08.04.2015 15:41
8                 | 102         | Master Yoda      | Tatooine  | 09.04.2015 15:41
9                 | 102         | NULL             | Tatooine  | 10.04.2015 15:41
10                | 102         | Master Yoda      | Tatooine  | 11.04.2015 15:41

Final result:

CustomerHistoryId | CustomerNum | CustomerName     | Planet    | ChangeDate
------------------|-------------|------------------|-----------|-----------------
1                 | 101         | Anakin Skywalker | Tatooine  | 14.03.2015 15:41
2                 | 102         | Yoda             | Coruscant | 14.03.2015 15:41
3                 | 103         | Obi-Wan Kenobi   | Coruscant | 24.03.2015 15:41
7                 | 103         | Obi-Wan Kenobi   | Degobah   | 08.04.2015 15:41
8                 | 102         | Master Yoda      | Tatooine  | 09.04.2015 15:41

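2 Answers:

Answer 0 (score: 0)

As I understand it, you want to ignore rows that contain NULL values and then collapse duplicates, disregarding the date. Assuming the ids and the dates are assigned in the same order, you can do this with aggregation: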

select min(CustomerHistoryId) as CustomerHistoryId,
       CustomerNum, CustomerName, Planet, min(ChangeDate) as ChangeDate
from t
where CustomerName is not null and Planet is not null
group by CustomerNum, CustomerName, Planet;
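
The question also mentions that some sources send 'N/A' instead of NULL. A minimal sketch of the same aggregation that treats both alike is shown below. It assumes the placeholder is exactly the string 'N/A' and that the table is called t, as in the query above; adjust both if your data differs.

```sql
-- Sketch only: 'N/A' is an assumed placeholder string, t is the table from the query above.
-- NULLIF(x, 'N/A') returns NULL when x equals 'N/A', so both kinds of
-- missing value are filtered out by the same IS NOT NULL test.
select min(CustomerHistoryId) as CustomerHistoryId,
       CustomerNum, CustomerName, Planet, min(ChangeDate) as ChangeDate
from t
where nullif(CustomerName, 'N/A') is not null
  and nullif(Planet, 'N/A') is not null
group by CustomerNum, CustomerName, Planet;
```

Like the original query, this drops incomplete rows rather than filling them from earlier rows.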

Answer 1 (score: 0)

| customerhistoryid | customernum |     customername |    planet |              changedate |
|-------------------|-------------|------------------|-----------|-------------------------|
|                 1 |         101 | Anakin Skywalker |  Tatooine | March, 16 2015 22:18:34 |
|                 2 |         102 |             Yoda | Coruscant | March, 16 2015 00:42:34 |
|                 3 |         103 |   Obi-Wan Kenobi | Coruscant | March, 26 2015 22:18:34 |
|                 4 |         102 |             Yoda | Coruscant | March, 16 2015 03:06:34 |
|                 5 |         102 |             Yoda |    (null) | March, 16 2015 05:30:34 |
|                 6 |         102 |             Yoda |     Basic | March, 16 2015 07:54:34 |
|                 7 |         103 |   Obi-Wan Kenobi |   Degobah | April, 10 2015 22:18:34 |
|                 8 |         102 |      Master Yoda |  Tatooine | April, 11 2015 00:42:34 |
|                 9 |         102 |           (null) |  Tatooine | April, 11 2015 03:06:34 |
|                10 |         102 |           (null) | Tatooine2 | April, 11 2015 07:54:34 |
|                11 |         102 |      Master Yoda |  Degobah2 | April, 13 2015 22:18:34 |

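The data is not "clean": for example, "Yoda" and "Master Yoda" share the same customernum value. There really should be a separate table holding one unique row per customernum with the correct name, but that table does not exist.

So here is one approach (there are other possibilities):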

select
   MIN(CustomerHistoryId), CustomerNum, CustomerName, Planet, MIN(ChangeDate)
from (
      select
        t.CustomerHistoryId
      , t.CustomerNum
      , COALESCE(t.CustomerName,
                   ( select top (1)
                        t2.CustomerName
                     from t t2
                     where t.CustomerName IS NULL 
                     and t2.CustomerName IS NOT NULL 
                     and t2.CustomerNum = t.CustomerNum
                     and t2.ChangeDate < t.ChangeDate
                     order by t2.ChangeDate DESC
                    )
                 ) AS CustomerName
      , COALESCE(t.Planet,
                   ( select top (1)
                        t2.Planet
                     from t t2
                     where t.Planet IS NULL 
                     and t2.Planet IS NOT NULL 
                     and t2.CustomerNum = t.CustomerNum -- same customer only, as in the CustomerName lookup
                     and t2.ChangeDate < t.ChangeDate
                     order by t2.ChangeDate DESC
                    )
                 ) AS Planet
      , t.ChangeDate
      from t
   ) dt
group by
   CustomerNum, CustomerName, Planet
order by
   CustomerNum, MIN(CustomerHistoryId)
;

This is a fairly generic approach, but you could use OUTER APPLY instead of the correlated subqueries.
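
As a rough illustration of that OUTER APPLY variant, the sketch below fills each NULL field from the latest earlier row that has a value, then groups as before. The aliases pn and pp and the output column aliases are made up for this example. Both lookups here are restricted to the same CustomerNum, so a value is never borrowed from another customer.

```sql
-- Sketch only: table t and its columns are those used in the query above;
-- the aliases pn (previous name) and pp (previous planet) are illustrative.
select
   min(dt.CustomerHistoryId) as CustomerHistoryId
 , dt.CustomerNum
 , dt.CustomerName
 , dt.Planet
 , min(dt.ChangeDate) as ChangeDate
from (
      select
        t.CustomerHistoryId
      , t.CustomerNum
      , coalesce(t.CustomerName, pn.CustomerName) as CustomerName
      , coalesce(t.Planet, pp.Planet) as Planet
      , t.ChangeDate
      from t
      outer apply (
            -- latest earlier non-NULL name for the same customer
            select top (1) t2.CustomerName
            from t t2
            where t2.CustomerNum = t.CustomerNum
              and t2.CustomerName is not null
              and t2.ChangeDate < t.ChangeDate
            order by t2.ChangeDate desc
      ) pn
      outer apply (
            -- latest earlier non-NULL planet for the same customer
            select top (1) t2.Planet
            from t t2
            where t2.CustomerNum = t.CustomerNum
              and t2.Planet is not null
              and t2.ChangeDate < t.ChangeDate
            order by t2.ChangeDate desc
      ) pp
   ) dt
group by dt.CustomerNum, dt.CustomerName, dt.Planet
order by dt.CustomerNum, min(dt.CustomerHistoryId);
```

OUTER APPLY keeps rows for which no earlier value exists, so a field with no earlier non-NULL value stays NULL, just as with the correlated subqueries.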

From that query I got this result:

| min | customernum |     customername |    planet |                     min |
|-----|-------------|------------------|-----------|-------------------------|
|   1 |         101 | Anakin Skywalker |  Tatooine | March, 16 2015 22:18:34 |
|   2 |         102 |             Yoda | Coruscant | March, 16 2015 00:42:34 |
|   6 |         102 |             Yoda |     Basic | March, 16 2015 07:54:34 |
|   8 |         102 |      Master Yoda |  Tatooine | April, 11 2015 00:42:34 |
|  10 |         102 |      Master Yoda | Tatooine2 | April, 11 2015 07:54:34 |
|  11 |         102 |      Master Yoda |  Degobah2 | April, 13 2015 22:18:34 |
|   3 |         103 |   Obi-Wan Kenobi | Coruscant | March, 26 2015 22:18:34 |
|   7 |         103 |   Obi-Wan Kenobi |   Degobah | April, 10 2015 22:18:34 |

I used this sqlfiddle (in Postgres, because MSSQL was not working at the time).