SQL insert missing dates

Time: 2016-01-13 03:02:13

Tags: sql sql-server partitioning

Using SQL Server 2012, I have a table named Allbucket:

CustodianAccountNum  symbol  EndDate     ManagerName       MarketValue  NetReturn
A9G040819            wabix   12/31/2013  GMO Benchmark     34751.10987  0.004072
A9G040819            wabix   1/31/2014   GMO Benchmark     34128.88767  -0.017905
A9G040819            wabix   2/28/2014   GMO Benchmark     49969.8081   0.0202
A9G040819            wabix   3/31/2014   GMO Benchmark     50370.993    0.008028
A9G040819            wabix   4/30/2014   GMO Benchmark     50995.0584   0.012389
A9G040819            amj     12/31/2013  JPMorgan Alerian  1234.55      -0.008154
A9G040819            amj     2/28/2014   JPMorgan Alerian  14849.76     -0.018599
A9G040819            amj     3/31/2014   JPMorgan Alerian  14892.8      0.015203
A9G040819            amj     4/30/2014   JPMorgan Alerian  15513.6      0.041684

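I am trying to load this data from one system into another. However, the target system requires that, for each given CustodianAccountNum, all symbols have the same date intervals over the period in which they all exist.

Notice that amj is missing 1/31/2014. The clue is that at least one other security, in this case wabix, has that date within the same time period. Also note that the dates are sometimes intra-month, for example 1/15/2014.

I was hoping to do something like a self join and partition, where I take all possible distinct dates for a given CustodianAccountNum and then force all rows to have the same periodicity over the period in which they overlap. For the interpolated rows, which are not in the original data and are "borrowed" from another symbol that exists in that time span, I want to pull the LAG MarketValue from that symbol's previous row (or 0 if no previous row exists) and force all other values to zero. There are other columns in the original data, but I am trying to keep this example simple.

So ideally amj would look like this, because wabix has a 1/31/2014 date: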

CustodianAccountNum  symbol  EndDate     ManagerName       MarketValue  NetReturn
A9G040819            amj     12/31/2013  JPMorgan Alerian  1234.55      -0.008154
A9G040819            amj     1/31/2014   JPMorgan Alerian  1234.55      -0.0
A9G040819            amj     2/28/2014   JPMorgan Alerian  14849.76     -0.018599
A9G040819            amj     3/31/2014   JPMorgan Alerian  14892.8      0.015203
A9G040819            amj     4/30/2014   JPMorgan Alerian  15513.6      0.041684

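The rule for a missing date is whether any other symbol has that date, partitioned by the given CustodianAccountNum. There are thousands of different accounts, but the dates only need to line up within a given account.

My concern is date gaps within the lifetime of a symbol for each account. If another symbol existed years before it, I do not need to add all those months. I only need to synchronize between the first date and the last date of a given symbol, so that all symbols line up where they overlap in time.

Update

Gordon Linoff's reply got me close, but not quite there. I had to change the OUTER APPLY to CROSS APPLY; otherwise I got thousands of records with NULL in every column.

I have modified the query to show all the needed columns, but it returns 0 for every value except MarketValue. Basically, for derived rows (1/31/2014 in my example) I want to force all values to 0 except MarketValue, which I want to pull from the previous market value. For all non-derived rows, I want to use the original values across the entire row.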

select 

ab.drank,d.EndDate,ab.BranchName,ab.EntityID,ab.CustodianAccountNum,ab.AccountID,ab.ManagerName,
ab.FTAssetStyle,ab.FTAssetClass,ab.PWMSecurityID,ab.AssetClassCode,ab.AssetClass,ab.Symbol,ab.SecType,
ab.Cusip,ab.Held,ab.MarketValue,
0 AS GrossFlow,0 AS GrossWeight,0 AS GrossReturn,0 AS NetFlow,0 AS NetWeight,
0 AS NetReturn,0 AS PortfolioFees,0 AS PortfolioExpenses,0 AS ManagerFees,0 AS Income

from (select distinct CustodianAccountNum, enddate from Allbucket) d join
   (select distinct CustodianAccountNum, symbol from Allbucket) s
   on d.CustodianAccountNum = s.CustodianAccountNum CROSS apply
   (select top 1 ab.*
   from Allbucket ab
   where d.CustodianAccountNum = ab.CustodianAccountNum and
      d.enddate <= ab.enddate and
      s.symbol = ab.symbol
            AND ab.CustodianAccountNum = 'A9G040819'
   order by d.enddate desc
   ) ab

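2 Answers:

Answer 0: (score: 0)

You can basically generate the rows using a cross join. In this case, it is really a join between the distinct dates and the distinct symbols on CustodianAccountNum, but it is still a Cartesian product within each account.

Then, outer apply can be used to select the most recent record for each combination of CustodianAccountNum, symbol, and EndDate.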
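A minimal sketch of that idea, assuming the Allbucket columns shown in the question. The aliases d, s, p and prev are illustrative, and treating "most recent" as the latest EndDate on or before the generated date is an assumption rather than a confirmed requirement:

```sql
-- Build every (date, symbol) pair per account, then look up the
-- latest Allbucket row on or before each generated date.
select s.CustodianAccountNum, s.symbol, d.EndDate,
       prev.ManagerName, prev.MarketValue
from (select distinct CustodianAccountNum, EndDate from Allbucket) d
join (select distinct CustodianAccountNum, symbol from Allbucket) s
  on d.CustodianAccountNum = s.CustodianAccountNum
outer apply
     (select top 1 p.*
      from Allbucket p
      where p.CustodianAccountNum = s.CustodianAccountNum
        and p.symbol = s.symbol
        and p.EndDate <= d.EndDate   -- on or before the generated date
      order by p.EndDate desc        -- latest such row first
     ) prev;
```

With OUTER APPLY, generated dates that fall before a symbol's first row come back with NULL in the prev columns; CROSS APPLY drops those pairs instead.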

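The following is a slightly different variation. It uses a left join to bring in the matching record, and then uses information from the two records when there is no match. I am not sure which columns should be 0, but the idea is: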

select s.CustodianAccountNum, s.symbol, d.EndDate, ab.ManagerName,
       ab.MarketValue, 0 as NetReturn,
       ab.xxx,                      -- for columns that come from the current row
       coalesce(ab.yyy, abprev.yyy) -- for columns from the previous row
from (select distinct CustodianAccountNum, enddate from Allbucket) d join
     (select distinct CustodianAccountNum, symbol from Allbucket) s
     on d.CustodianAccountNum = s.CustodianAccountNum left join
     Allbucket ab
     on d.CustodianAccountNum = ab.CustodianAccountNum and
        d.enddate = ab.enddate and   -- exact match on the current date
        s.symbol = ab.symbol outer apply
     (select top 1 p.*
      from Allbucket p
      where d.CustodianAccountNum = p.CustodianAccountNum and
            p.enddate < d.enddate and  -- rows strictly before the current date
            s.symbol = p.symbol
      order by p.enddate desc          -- latest of those first
     ) abprev

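Answer 1: (score: 0)

A slightly different approach, but still using a Cartesian product and the APPLY operator (OUTER APPLY is needed in this one). To get 0 where you do not want the previous value carried forward, just amend the COALESCE() accordingly.

SQL Fiddle

MS SQL Server 2014 Schema Setup: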

CREATE TABLE Allbucket
    ([CustodianAccountNum] varchar(9), [symbol] varchar(5), [EndDate] datetime, [ManagerName] varchar(16), [MarketValue] numeric
     , [NetReturn] decimal(12,6))
;

INSERT INTO Allbucket
    ([CustodianAccountNum], [symbol], [EndDate], [ManagerName], [MarketValue], [NetReturn])
VALUES
    ('A9G040819', 'wabix', '2013-12-31 00:00:00', 'GMO Benchmark', 34751.10987, 0.004072),
    ('A9G040819', 'wabix', '2014-01-31 00:00:00', 'GMO Benchmark', 34128.88767, -0.017905),
    ('A9G040819', 'wabix', '2014-02-28 00:00:00', 'GMO Benchmark', 49969.8081, 0.0202),
    ('A9G040819', 'wabix', '2014-03-31 00:00:00', 'GMO Benchmark', 50370.993, 0.008028),
    ('A9G040819', 'wabix', '2014-04-30 00:00:00', 'GMO Benchmark', 50995.0584, 0.012389),
    ('A9G040819', 'amj', '2013-12-31 00:00:00', 'JPMorgan Alerian', 1234.55, -0.008154),
    ('A9G040819', 'amj', '2014-02-28 00:00:00', 'JPMorgan Alerian', 14849.76, -0.018599),
    ('A9G040819', 'amj', '2014-03-31 00:00:00', 'JPMorgan Alerian', 14892.8, 0.015203),
    ('A9G040819', 'amj', '2014-04-30 00:00:00', 'JPMorgan Alerian', 15513.6, 0.041684)
;

Query 1:

SELECT
      s.CustodianAccountNum
    , s.symbol
    , d.enddate
    , COALESCE(ab.ManagerName, ap.ManagerName) AS ManagerName
    , COALESCE(ab.MarketValue, ap.MarketValue) AS MarketValue
    , COALESCE(ab.NetReturn, 0) AS NetReturn
FROM (
      SELECT
            CustodianAccountNum
          , symbol
          , MIN(enddate) symstart
          , MAX(enddate) symend
      FROM Allbucket
      GROUP BY
            CustodianAccountNum
          , symbol
      ) s
      JOIN (
            SELECT DISTINCT
                  cast(enddate as date) as enddate
            FROM Allbucket
      ) d ON d.enddate BETWEEN s.symstart AND s.symend
      LEFT JOIN Allbucket ab ON s.CustodianAccountNum = ab.CustodianAccountNum
                  AND s.symbol = ab.symbol
                  AND ab.enddate = d.enddate
      OUTER APPLY (
            SELECT TOP 1
                  t.*
            FROM Allbucket t
            WHERE s.CustodianAccountNum = t.CustodianAccountNum
                  AND s.symbol = t.symbol
                  AND d.enddate <= t.enddate
            ORDER BY
                  d.enddate DESC
      ) ap

Results:

| CustodianAccountNum | symbol |    enddate |      ManagerName | MarketValue | NetReturn |
|---------------------|--------|------------|------------------|-------------|-----------|
|           A9G040819 |    amj | 2013-12-31 | JPMorgan Alerian |        1235 | -0.008154 |
|           A9G040819 |    amj | 2014-01-31 | JPMorgan Alerian |       14850 |         0 |
|           A9G040819 |    amj | 2014-02-28 | JPMorgan Alerian |       14850 | -0.018599 |
|           A9G040819 |    amj | 2014-03-31 | JPMorgan Alerian |       14893 |  0.015203 |
|           A9G040819 |    amj | 2014-04-30 | JPMorgan Alerian |       15514 |  0.041684 |
|           A9G040819 |  wabix | 2013-12-31 |    GMO Benchmark |       34751 |  0.004072 |
|           A9G040819 |  wabix | 2014-01-31 |    GMO Benchmark |       34129 | -0.017905 |
|           A9G040819 |  wabix | 2014-02-28 |    GMO Benchmark |       49970 |    0.0202 |
|           A9G040819 |  wabix | 2014-03-31 |    GMO Benchmark |       50371 |  0.008028 |
|           A9G040819 |  wabix | 2014-04-30 |    GMO Benchmark |       50995 |  0.012389 |
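In the results above, the derived amj row for 2014-01-31 carries 14850, which is the 2014-02-28 value, whereas the question asks for the previous market value (1234.55). The following is a hedged variant of Query 1, not a tested replacement. It assumes that "previous" means the latest row strictly before the generated date, and that dates should be collected per account rather than across all accounts. The table and column names come from the schema setup above; everything else is illustrative.

```sql
SELECT
      s.CustodianAccountNum
    , s.symbol
    , d.enddate
    , COALESCE(ab.ManagerName, ap.ManagerName) AS ManagerName
    , COALESCE(ab.MarketValue, ap.MarketValue, 0) AS MarketValue
    , COALESCE(ab.NetReturn, 0) AS NetReturn
FROM (
      -- lifetime of each symbol within an account
      SELECT CustodianAccountNum, symbol
           , MIN(enddate) AS symstart, MAX(enddate) AS symend
      FROM Allbucket
      GROUP BY CustodianAccountNum, symbol
      ) s
      JOIN (
            -- distinct dates per account (assumption: no cross-account alignment)
            SELECT DISTINCT CustodianAccountNum, CAST(enddate AS date) AS enddate
            FROM Allbucket
      ) d ON d.CustodianAccountNum = s.CustodianAccountNum
         AND d.enddate BETWEEN s.symstart AND s.symend
      LEFT JOIN Allbucket ab ON ab.CustodianAccountNum = s.CustodianAccountNum
                  AND ab.symbol = s.symbol
                  AND ab.enddate = d.enddate
      OUTER APPLY (
            -- latest row strictly before the generated date
            SELECT TOP 1 t.*
            FROM Allbucket t
            WHERE t.CustodianAccountNum = s.CustodianAccountNum
                  AND t.symbol = s.symbol
                  AND t.enddate < d.enddate
            ORDER BY t.enddate DESC
      ) ap
ORDER BY s.symbol, d.enddate;
```

Under these assumptions, the derived amj row for 2014-01-31 should pick up its MarketValue from the 2013-12-31 row. Because the generated dates are limited to each symbol's own first and last date, a derived row always has an earlier row to borrow from, so the trailing 0 in the MarketValue COALESCE() is only a safeguard.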

nb: you could use ISNULL() instead of COALESCE()
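As a small illustration of that swap (the variable name is hypothetical and stands in for a NetReturn value from a row with no match):

```sql
-- @missing stands in for a NULL NetReturn on a derived row
DECLARE @missing decimal(12,6) = NULL;

SELECT ISNULL(@missing, 0)   AS with_isnull    -- T-SQL specific, exactly two arguments
     , COALESCE(@missing, 0) AS with_coalesce; -- standard SQL, any number of arguments
```

ISNULL() only fits where there is a single fallback; an expression with more than one fallback value needs COALESCE() or nested ISNULL() calls.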

[EDITS] Corrected the data type on NetReturn, and changed enddate to date, although that is optional.