SQL join against date ranges?

Time: 2010-02-21 16:11:05

Tags: sql sql-server tsql join date-range

Consider two tables:

Transactions, with amounts in a foreign currency:

     Date  Amount
========= =======
 1/2/2009    1500
 2/4/2009    2300
3/15/2009     300
4/17/2009    2200
etc.

ExchangeRates, with the value of the primary currency (let's say dollars) in the foreign currency:

     Date    Rate
========= =======
 2/1/2009    40.1
 3/1/2009    41.0
 4/1/2009    38.5
 5/1/2009    42.7
etc.

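Exchange rates can be entered for arbitrary dates - the user could enter them daily, weekly, monthly, or at irregular intervals.

In order to translate the foreign amounts to dollars, I need to respect these rules:

A. If possible, use the most recent previous rate; so the transaction on 2/4/2009 uses the rate for 2/1/2009, and the transaction on 3/15/2009 uses the rate for 3/1/2009.

B. If there isn't a rate defined for a previous date, use the earliest rate available. So the transaction on 1/2/2009 uses the rate for 2/1/2009, since there isn't an earlier rate defined.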
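For readers who want the two rules pinned down outside SQL, here is a minimal Python sketch. The `RATES` list and the `pick_rate` helper are hypothetical names introduced only for illustration, and "previous" is taken to mean strictly earlier, matching the `t.Date > ex.Date` comparison in the query below.

```python
from bisect import bisect_left
from datetime import date

# Sample rates from the question, sorted by date (hypothetical in-memory copy).
RATES = [
    (date(2009, 2, 1), 40.1),
    (date(2009, 3, 1), 41.0),
    (date(2009, 4, 1), 38.5),
    (date(2009, 5, 1), 42.7),
]

def pick_rate(tx_date, rates=RATES):
    """Return the rate for tx_date under rules A and B."""
    dates = [d for d, _ in rates]
    # Index of the first rate dated on or after tx_date.
    i = bisect_left(dates, tx_date)
    if i == 0:
        # Rule B: no strictly earlier rate exists, so use the earliest one.
        return rates[0][1]
    # Rule A: the most recent rate dated strictly before tx_date.
    return rates[i - 1][1]

print(pick_rate(date(2009, 2, 4)))  # 40.1 (rule A)
print(pick_rate(date(2009, 1, 2)))  # 40.1 (rule B)
```

A binary search keeps each lookup logarithmic in the number of rates, which is roughly what an index on the rate date gives the database.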

This works...

Select 
    t.Date, 
    t.Amount,
    ConvertedAmount=(   
        Select Top 1 
            t.Amount/ex.Rate
        From ExchangeRates ex
        Where t.Date > ex.Date
        Order by ex.Date desc
    )
From Transactions t

... but (1) it seems like a join would be more efficient & elegant, and (2) it doesn't deal with Rule B above.

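Is there an alternative to using the subquery to find the appropriate rate? And is there an elegant way to handle Rule B, without tying myself in knots?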

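6 answers:

Answer 0: (score: 20)

You could first do a self-join on the exchange rates, ordered by date, so that you have the start and end date of each exchange rate, without any overlap or gap in the dates (maybe add that as a view to your database - in my case I'm just using a common table expression).

Now joining those "prepared" rates with the transactions is simple and efficient.

Something like: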

WITH IndexedExchangeRates AS (           
            SELECT  Row_Number() OVER (ORDER BY Date) ix,
                    Date,
                    Rate 
            FROM    ExchangeRates 
        ),
        RangedExchangeRates AS (             
            SELECT  CASE WHEN IER.ix=1 THEN CAST('1753-01-01' AS datetime) 
                    ELSE IER.Date 
                    END DateFrom,
                    COALESCE(IER2.Date, GETDATE()) DateTo,
                    IER.Rate 
            FROM    IndexedExchangeRates IER 
            LEFT JOIN IndexedExchangeRates IER2 
            ON IER.ix = IER2.ix-1 
        )
SELECT  T.Date,
        T.Amount,
        RER.Rate,
        T.Amount/RER.Rate ConvertedAmount 
FROM    Transactions T 
LEFT JOIN RangedExchangeRates RER 
ON (T.Date > RER.DateFrom) AND (T.Date <= RER.DateTo)

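Notes:

  • You could replace GETDATE() with a date in the far future; I'm assuming here that no rates for the future are known.

  • Rule (B) is implemented by setting the date of the first known exchange rate to the minimum date supported by the SQL Server datetime type, which should (by definition, if that is the type you're using for the Date column) be the smallest value possible.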
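The range-building idea in this answer can be sketched in Python as follows. The `build_ranges` and `lookup` helpers are hypothetical names; `date.min` stands in for the `1753-01-01` lower bound and the `open_end` argument for `GETDATE()`.

```python
from datetime import date

def build_ranges(rates, open_end):
    """Pair each rate with the next rate's date, like the two CTEs."""
    ordered = sorted(rates)
    ranges = []
    for ix, (start, rate) in enumerate(ordered):
        # Rule B: the first range is stretched back to the minimum date.
        date_from = date.min if ix == 0 else start
        # The last range has no successor, so it runs to open_end.
        date_to = ordered[ix + 1][0] if ix + 1 < len(ordered) else open_end
        ranges.append((date_from, date_to, rate))
    return ranges

def lookup(ranges, tx_date):
    """Mirror the join predicate: DateFrom < tx_date <= DateTo."""
    for date_from, date_to, rate in ranges:
        if date_from < tx_date <= date_to:
            return rate
    return None  # like the LEFT JOIN, unmatched dates yield no rate
```

Because the lower bound is exclusive and the upper bound inclusive, a transaction dated exactly on a rate date picks up the previous rate, which agrees with the strict "previous" comparison in the question.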

Answer 1: (score: 3)

Suppose you had an extended exchange rate table that contained:

 Start Date   End Date    Rate
 ========== ========== =======
 0001-01-01 2009-01-31    40.1
 2009-02-01 2009-02-28    40.1
 2009-03-01 2009-03-31    41.0
 2009-04-01 2009-04-30    38.5
 2009-05-01 9999-12-31    42.7

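We can discuss the details of whether the first two rows should be combined, but the general idea is that it is trivial to find the exchange rate for a given date. This structure works with the SQL 'BETWEEN' operator, which includes both ends of the range. Often, a better format for ranges is half-open: the first date listed is included and the second is excluded. Note that there is a constraint on the data rows: there are (a) no gaps in the coverage of the range of dates and (b) no overlaps in the coverage. Enforcing those constraints is not completely trivial (polite understatement - meiosis).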
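To make constraints (a) and (b) concrete, here is a small Python sketch that checks a list of closed `[start, end]` day ranges for gaps and overlaps. The `coverage_problems` helper is a hypothetical name, and it only compares consecutive ranges after sorting by start date, so it is a sanity check rather than a full constraint implementation.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)

def coverage_problems(rows):
    """Report gaps and overlaps between consecutive closed [start, end] ranges."""
    problems = []
    ordered = sorted(rows)
    for (_, e1, _), (s2, e2, _) in zip(ordered, ordered[1:]):
        step = (s2 - e1).days  # 1 means the ranges are exactly adjacent
        if step > 1:
            # Days strictly between the two ranges are not covered.
            problems.append(("gap", e1 + ONE_DAY, s2 - ONE_DAY))
        elif step < 1:
            # The next range starts on or before the previous range's end.
            problems.append(("overlap", s2, min(e1, e2)))
    return problems
```

With half-open ranges the adjacency test would simply be `s2 == e1`, which is one reason that format is often preferred.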

Now the basic query is trivial, and Case B is no longer a special case:

SELECT T.Date, T.Amount, X.Rate
  FROM Transactions AS T JOIN ExtendedExchangeRate AS X
       ON T.Date BETWEEN X.StartDate AND X.EndDate;

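The tricky part is creating the ExtendedExchangeRate table from the given ExchangeRate table on the fly. If it is an option, then revising the structure of the basic ExchangeRate table to match the ExtendedExchangeRate table would be a good idea; you resolve the messy stuff when the data is entered (once a month) instead of every time an exchange rate needs to be determined (many times a day).

How to create the extended exchange rate table? If your system supports adding or subtracting 1 from a date value to obtain the next or previous day (and has a single-row table called 'Dual'), then this will work (without using any OLAP functions):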

CREATE TABLE ExchangeRate
(
    Date    DATE NOT NULL,
    Rate    DECIMAL(10,5) NOT NULL
);
INSERT INTO ExchangeRate VALUES('2009-02-01', 40.1);
INSERT INTO ExchangeRate VALUES('2009-03-01', 41.0);
INSERT INTO ExchangeRate VALUES('2009-04-01', 38.5);
INSERT INTO ExchangeRate VALUES('2009-05-01', 42.7);

First row:

SELECT '0001-01-01' AS StartDate,
       (SELECT MIN(Date) - 1 FROM ExchangeRate) AS EndDate,
       (SELECT Rate FROM ExchangeRate
         WHERE Date = (SELECT MIN(Date) FROM ExchangeRate)) AS Rate
FROM Dual;

Result:

0001-01-01  2009-01-31      40.10000

Last row:

SELECT (SELECT MAX(Date) FROM ExchangeRate) AS StartDate,
       '9999-12-31' AS EndDate,
       (SELECT Rate FROM ExchangeRate
         WHERE Date = (SELECT MAX(Date) FROM ExchangeRate)) AS Rate
FROM Dual;

Result:

2009-05-01  9999-12-31      42.70000

Middle rows:

SELECT X1.Date     AS StartDate,
       X2.Date - 1 AS EndDate,
       X1.Rate     AS Rate
  FROM ExchangeRate AS X1 JOIN ExchangeRate AS X2
       ON X1.Date < X2.Date
 WHERE NOT EXISTS
       (SELECT *
          FROM ExchangeRate AS X3
         WHERE X3.Date > X1.Date AND X3.Date < X2.Date
        );

Result:

2009-02-01  2009-02-28      40.10000
2009-03-01  2009-03-31      41.00000
2009-04-01  2009-04-30      38.50000

Note that the NOT EXISTS sub-query is rather crucial. Without it, the 'middle rows' result is:

2009-02-01  2009-02-28      40.10000
2009-02-01  2009-03-31      40.10000    # Unwanted
2009-02-01  2009-04-30      40.10000    # Unwanted
2009-03-01  2009-03-31      41.00000
2009-03-01  2009-04-30      41.00000    # Unwanted
2009-04-01  2009-04-30      38.50000

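The number of unwanted rows increases dramatically as the table grows: for N rate rows the self-join yields N*(N-1)/2 pairs, of which only N-1 are wanted, leaving (N-1)*(N-2)/2 unwanted rows (3 of them for the N = 4 rows above).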
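The row count can be checked by brute force. This Python sketch, with the hypothetical helper `middle_row_counts`, enumerates every `X1.Date < X2.Date` pair over N distinct dates and separates the adjacent pairs (the wanted rows) from the rest; for N = 4 it reports 6 pairs and 3 unwanted rows, matching the listing above, and in general the unwanted count is (N-1)*(N-2)/2.

```python
from itertools import combinations

def middle_row_counts(n):
    """Count rows from the X1.Date < X2.Date self-join over n distinct rate dates."""
    dates = range(n)  # stand-ins for n distinct, sorted dates
    pairs = list(combinations(dates, 2))             # every X1 < X2 pair
    wanted = [p for p in pairs if p[1] - p[0] == 1]  # adjacent dates only
    return len(pairs), len(wanted), len(pairs) - len(wanted)

print(middle_row_counts(4))  # (6, 3, 3)
```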

The result for ExtendedExchangeRate is the (disjoint) UNION of the three queries:

SELECT DATE '0001-01-01' AS StartDate,
       (SELECT MIN(Date) - 1 FROM ExchangeRate) AS EndDate,
       (SELECT Rate FROM ExchangeRate
         WHERE Date = (SELECT MIN(Date) FROM ExchangeRate)) AS Rate
FROM Dual
UNION
SELECT X1.Date     AS StartDate,
       X2.Date - 1 AS EndDate,
       X1.Rate     AS Rate
  FROM ExchangeRate AS X1 JOIN ExchangeRate AS X2
       ON X1.Date < X2.Date
 WHERE NOT EXISTS
       (SELECT *
          FROM ExchangeRate AS X3
         WHERE X3.Date > X1.Date AND X3.Date < X2.Date
        )
UNION
SELECT (SELECT MAX(Date) FROM ExchangeRate) AS StartDate,
       DATE '9999-12-31' AS EndDate,
       (SELECT Rate FROM ExchangeRate
         WHERE Date = (SELECT MAX(Date) FROM ExchangeRate)) AS Rate
FROM Dual;

On the test DBMS (IBM Informix Dynamic Server 11.50.FC6 on MacOS X 10.6.2), I was able to convert the query into a view, but I had to stop cheating with the data types - by coercing the strings into dates:

CREATE VIEW ExtendedExchangeRate(StartDate, EndDate, Rate) AS
    SELECT DATE('0001-01-01')  AS StartDate,
           (SELECT MIN(Date) - 1 FROM ExchangeRate) AS EndDate,
           (SELECT Rate FROM ExchangeRate WHERE Date = (SELECT MIN(Date) FROM ExchangeRate)) AS Rate
    FROM Dual
    UNION
    SELECT X1.Date     AS StartDate,
           X2.Date - 1 AS EndDate,
           X1.Rate     AS Rate
      FROM ExchangeRate AS X1 JOIN ExchangeRate AS X2
           ON X1.Date < X2.Date
     WHERE NOT EXISTS
           (SELECT *
              FROM ExchangeRate AS X3
             WHERE X3.Date > X1.Date AND X3.Date < X2.Date
            )
    UNION 
    SELECT (SELECT MAX(Date) FROM ExchangeRate) AS StartDate,
           DATE('9999-12-31') AS EndDate,
           (SELECT Rate FROM ExchangeRate WHERE Date = (SELECT MAX(Date) FROM ExchangeRate)) AS Rate
    FROM Dual;
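For anyone without Informix at hand, the same three-part UNION can be tried with Python's built-in `sqlite3` module. This is only a sketch under assumptions: SQLite is a stand-in for the answer's DBMS, dates are stored as ISO text, `date(x, '-1 day')` replaces `Date - 1`, and no `Dual` table is needed because SQLite allows `SELECT` without `FROM`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ExchangeRate (Date TEXT NOT NULL, Rate REAL NOT NULL);
    INSERT INTO ExchangeRate VALUES ('2009-02-01', 40.1);
    INSERT INTO ExchangeRate VALUES ('2009-03-01', 41.0);
    INSERT INTO ExchangeRate VALUES ('2009-04-01', 38.5);
    INSERT INTO ExchangeRate VALUES ('2009-05-01', 42.7);

    CREATE VIEW ExtendedExchangeRate AS
        -- First row: everything before the earliest rate
        SELECT '0001-01-01' AS StartDate,
               (SELECT date(MIN(Date), '-1 day') FROM ExchangeRate) AS EndDate,
               (SELECT Rate FROM ExchangeRate
                 WHERE Date = (SELECT MIN(Date) FROM ExchangeRate)) AS Rate
        UNION
        -- Middle rows: each rate runs until the day before the next rate
        SELECT X1.Date, date(X2.Date, '-1 day'), X1.Rate
          FROM ExchangeRate AS X1 JOIN ExchangeRate AS X2
               ON X1.Date < X2.Date
         WHERE NOT EXISTS
               (SELECT * FROM ExchangeRate AS X3
                 WHERE X3.Date > X1.Date AND X3.Date < X2.Date)
        UNION
        -- Last row: the latest rate stays valid indefinitely
        SELECT (SELECT MAX(Date) FROM ExchangeRate),
               '9999-12-31',
               (SELECT Rate FROM ExchangeRate
                 WHERE Date = (SELECT MAX(Date) FROM ExchangeRate));
""")

rows = conn.execute(
    "SELECT StartDate, EndDate, Rate FROM ExtendedExchangeRate ORDER BY StartDate"
).fetchall()
for row in rows:
    print(row)
```

The five rows it prints should correspond to the extended table shown at the start of this answer; ISO-formatted text dates sort correctly as strings, which is why plain comparisons work here.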

Answer 2: (score: 1)

I can't test this, but I think it would work. It uses COALESCE with two sub-queries to pick the rate by rule A or rule B.

Select t.Date, t.Amount, 
  ConvertedAmount = t.Amount/coalesce(    
    (Select Top 1 ex.Rate 
        From ExchangeRates ex 
        Where t.Date > ex.Date 
        Order by ex.Date desc )
     ,
     (select top 1 ex.Rate 
        From ExchangeRates ex
        Order by ex.Date asc)
    ) 
From Transactions t
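Since the answer is untested, here is one way to try the COALESCE idea on the question's sample data. It is a sketch using Python's `sqlite3` module as a stand-in for SQL Server, so `LIMIT 1` replaces `TOP 1` and dates are stored as ISO text; both are assumptions of the sketch, not part of the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Transactions (Date TEXT, Amount REAL);
    CREATE TABLE ExchangeRates (Date TEXT, Rate REAL);
    INSERT INTO Transactions VALUES ('2009-01-02', 1500);
    INSERT INTO Transactions VALUES ('2009-02-04', 2300);
    INSERT INTO Transactions VALUES ('2009-03-15', 300);
    INSERT INTO Transactions VALUES ('2009-04-17', 2200);
    INSERT INTO ExchangeRates VALUES ('2009-02-01', 40.1);
    INSERT INTO ExchangeRates VALUES ('2009-03-01', 41.0);
    INSERT INTO ExchangeRates VALUES ('2009-04-01', 38.5);
    INSERT INTO ExchangeRates VALUES ('2009-05-01', 42.7);
""")

QUERY = """
    SELECT t.Date,
           COALESCE(
               -- Rule A: latest rate strictly before the transaction
               (SELECT ex.Rate FROM ExchangeRates ex
                 WHERE ex.Date < t.Date ORDER BY ex.Date DESC LIMIT 1),
               -- Rule B: otherwise the earliest rate on file
               (SELECT ex.Rate FROM ExchangeRates ex
                 ORDER BY ex.Date ASC LIMIT 1)
           ) AS Rate
      FROM Transactions t
     ORDER BY t.Date
"""
rates_used = dict(conn.execute(QUERY).fetchall())
print(rates_used)
```

The second sub-query is only consulted when the first returns NULL, which is exactly the case where no earlier rate exists.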

Answer 3: (score: 0)

SELECT 
    a.tranDate, 
    a.Amount,
    a.Amount/a.Rate as convertedRate
FROM
    (

    SELECT 
        t.date tranDate,
        e.date as rateDate,
        t.Amount,
        e.rate,
        RANK() OVER (Partition BY t.date ORDER BY
                         CASE WHEN DATEDIFF(day,e.date,t.date) < 0 THEN
                                   DATEDIFF(day,e.date,t.date) * -100000
                              ELSE DATEDIFF(day,e.date,t.date)
                         END ) AS diff
    FROM 
        ExchangeRates e
    CROSS JOIN 
        Transactions t
         ) a
WHERE a.diff = 1

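Calculate the difference between the transaction date and the rate date, then multiply the negative values (condition B) by -100000 so that they can still be ranked but the positive values (condition A) always take priority. We then select the minimum date difference for each transaction date using the RANK() OVER clause.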
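The ranking trick may be easier to follow as a plain sort key. In this Python sketch `rank_key` and `best_rate` are hypothetical helpers; the key mirrors the CASE expression, and `min` plays the role of keeping only rank 1.

```python
from datetime import date

def rank_key(tx_date, rate_date, penalty=100000):
    """Sort key mirroring the CASE expression inside RANK() OVER (...)."""
    diff = (tx_date - rate_date).days  # like DATEDIFF(day, e.date, t.date)
    # Rates dated after the transaction get a large positive key,
    # so any rate on or before the transaction sorts ahead of them.
    return diff * -penalty if diff < 0 else diff

def best_rate(tx_date, rates):
    """Pick the rate whose (date, rate) pair would receive rank 1."""
    return min(rates, key=lambda r: rank_key(tx_date, r[0]))[1]
```

Two caveats follow from the key itself: a rate dated on the same day as the transaction gets key 0 and wins, which is slightly more inclusive than the strict comparison in the question, and the ordering only holds while earlier rates are fewer than 100000 days older than the transaction.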

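Answer 4: (score: 0)

Many solutions will work. You should really find the one that works best for your workload: do you usually search for one transaction, a list of them, or all of them?

The tie-breaker solution given your schema is: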

SELECT      t.Date,
            t.Amount,
            r.Rate
            --//add your multiplication/division here

FROM        "Transactions" t

INNER JOIN  "ExchangeRates" r
        ON  r."ExchangeRateID" = (
                        SELECT TOP 1 x."ExchangeRateID"
                        FROM        "ExchangeRates" x
                        WHERE       x."SourceCurrencyISO" = t."SourceCurrencyISO" --//these are currency-related filters for your tables
                                AND x."TargetCurrencyISO" = t."TargetCurrencyISO" --//,which you should also JOIN on
                                AND x."Date" <= t."Date"
                        ORDER BY    x."Date" DESC)

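You need the right indexes for this query to be fast. Ideally you should also not JOIN on "Date", but on an "ID"-like field (INTEGER). Give me more schema info and I will create an example for you.

Answer 5: (score: 0)

There's nothing about a join that will be more elegant than the TOP 1 correlated subquery in your original post. However, as you say, it doesn't satisfy requirement B.

These queries do work (SQL Server 2005 or later required). See the SqlFiddle for these.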

SELECT
   T.*,
   ExchangeRate = E.Rate
FROM
  dbo.Transactions T
  CROSS APPLY (
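    SELECT TOP 1 Rate
    FROM dbo.ExchangeRate E
    -- No WHERE filter here: later rates must stay visible so rule B can fall back to them
    ORDER BY
      CASE WHEN E.RateDate <= T.TranDate THEN 0 ELSE 1 END,            -- rates on or before the transaction first
      CASE WHEN E.RateDate <= T.TranDate THEN E.RateDate END DESC,     -- rule A: most recent of those
      E.RateDate                                                       -- rule B: otherwise the earliest later rate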
  ) E;

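Note that a CROSS APPLY returning a single column value is functionally equivalent to the correlated subquery in the SELECT clause that you showed. I just prefer CROSS APPLY now because it is much more flexible: it lets you reuse the value in multiple places, return multiple rows (for custom unpivoting), and return multiple columns.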

SELECT
   T.*,
   ExchangeRate = Coalesce(E.Rate, E2.Rate)
FROM
  dbo.Transactions T
  OUTER APPLY (
    SELECT TOP 1 Rate
    FROM dbo.ExchangeRate E
    WHERE E.RateDate <= T.TranDate
    ORDER BY E.RateDate DESC
  ) E
  OUTER APPLY (
    SELECT TOP 1 Rate
    FROM dbo.ExchangeRate E2
    WHERE E.Rate IS NULL
    ORDER BY E2.RateDate
  ) E2;

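I don't know which one might perform better, or whether either will perform better than the other answers on the page. With a proper index on the Date columns, they should perform pretty well - definitely better than any Row_Number() solution.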