SQL join condition: either A or B, but not both A and B

Time: 2019-02-08 12:34:08

Tags: sql sql-server join

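I have sales data by year and quarter, and for the most recent year I want to fill the missing quarters with the last available value.

Say we have this source table: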

+------+---------+-------+--------+
| year | quarter | sales | row_no |
+------+---------+-------+--------+
| 2018 |       1 |  4000 |      5 |
| 2018 |       2 |  6000 |      4 |
| 2018 |       3 |  5000 |      3 |
| 2018 |       4 |  3000 |      2 |
| 2019 |       1 |  8000 |      1 |
+------+---------+-------+--------+
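
For anyone who wants to run the queries in this thread, the sample data can be loaded with a script along these lines. This is only a sketch: the `int` column types and the `NOT NULL` constraints are assumptions, since the real table definition is not shown.

```sql
-- Hypothetical definition of MyTable; the column types are assumed.
CREATE TABLE MyTable (
    [year]    int NOT NULL,
    [quarter] int NOT NULL,
    sales     int NOT NULL,
    row_no    int NOT NULL   -- 1 marks the most recent row in the sample data
);

INSERT INTO MyTable ([year], [quarter], sales, row_no)
VALUES (2018, 1, 4000, 5),
       (2018, 2, 6000, 4),
       (2018, 3, 5000, 3),
       (2018, 4, 3000, 2),
       (2019, 1, 8000, 1);
```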

Desired result:

+------+---------+-------+------------------------+
| year | quarter | sales |                        |
+------+---------+-------+------------------------+
| 2018 |       1 |  4000 |                        |
| 2018 |       2 |  6000 |                        |
| 2018 |       3 |  5000 |                        |
| 2018 |       4 |  3000 |                        |
| 2019 |       1 |  8000 |                        |
| 2019 |       2 |  8000 | <repeat the last value |
| 2019 |       3 |  8000 | <repeat the last value |
| 2019 |       4 |  8000 | <repeat the last value |
+------+---------+-------+------------------------+

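So the task is to build the Cartesian product of years and quarters and left join to it either the matching sales row or, where there is none, the last available sales row.

This code gets me almost there: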

select r.year, k.quarter, t.sales
from (select distinct year        from [MyTable]) r cross join
     (select distinct quarter     from [MyTable]) k left join
     [MyTable] t
     on (r.year = t.year and k.quarter=t.quarter) or row_no=1

How can I correct the last line (the join condition) so that the 2018 rows are not doubled?

3 answers:

Answer 0 (score: 3)

One approach uses OUTER APPLY:

select y.year, q.quarter, t.sales
from (select distinct year from [MyTable]) y cross join
     (select distinct quarter from [MyTable]) q outer apply
     (select top (1) t.*
      from [MyTable] t
      where t.year < y.year or
            (t.year = y.year and t.quarter <= q.quarter)
      order by t.year desc, t.quarter desc
     ) t;

For your volume of data, this should be fine.

A more efficient approach, assuming values only need to be filled in at the end:

select y.year, q.quarter,
       coalesce(t.sales, tdefault.sales) as sales
from (select distinct year from [MyTable]) y cross join
     (select distinct quarter from [MyTable]) q left join
     [MyTable] t
     on t.year = y.year and
        t.quarter = q.quarter cross join
     (select top (1) t.*
      from [MyTable] t
      order by t.year desc, t.quarter desc
     ) tdefault

Answer 1 (score: 1)

A very different approach, using CTEs and window functions. It needs neither two scans of the table nor a triangular join.

WITH VTE AS(
    SELECT *
    FROM (VALUES (2018,1,4000,5),
                 (2018,2,6000,4),
                 (2018,3,5000,3),
                 (2018,4,3000,2),
                 (2019,1,8000,1)) V([Year],[Quarter],sales, row_no)),
CTE AS(
    SELECT Y.Year,
           Q.Quarter,
           V.sales,
           V.row_no,
           COUNT(CASE WHEN V.sales IS NOT NULL THEN 1 END) OVER (ORDER BY Y.[Year], Q.[Quarter]
                                                                 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Grp
    FROM (VALUES(2018),(2019)) Y([Year])
         CROSS JOIN (VALUES(1),(2),(3),(4)) Q([Quarter])
         LEFT JOIN VTE V ON Y.[Year] = V.[Year] AND Q.[Quarter] = V.[Quarter])
SELECT C.[Year],
       C.[Quarter],
       MAX(C.sales) OVER (PARTITION BY C.Grp) AS Sales
FROM CTE C;

This only works on SQL Server 2012+ (because ROWS BETWEEN was introduced in SQL Server 2012). Hopefully you are not using 2008, as it is (almost) completely out of support.
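
To see why the grouping works, it can help to look at the intermediate Grp column on its own. The sketch below is only an illustration that reuses the sample values inline. It writes `COUNT(V.sales)` as shorthand for the CASE expression, which is equivalent here because COUNT over a column skips NULLs.

```sql
WITH VTE AS (
    SELECT *
    FROM (VALUES (2018,1,4000),
                 (2018,2,6000),
                 (2018,3,5000),
                 (2018,4,3000),
                 (2019,1,8000)) V([Year],[Quarter],sales)
)
SELECT Y.[Year],
       Q.[Quarter],
       V.sales,
       -- running count of non-NULL sales values seen so far
       COUNT(V.sales) OVER (ORDER BY Y.[Year], Q.[Quarter]
                            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Grp
FROM (VALUES (2018),(2019)) Y([Year])
     CROSS JOIN (VALUES (1),(2),(3),(4)) Q([Quarter])
     LEFT JOIN VTE V ON Y.[Year] = V.[Year] AND Q.[Quarter] = V.[Quarter]
ORDER BY Y.[Year], Q.[Quarter];
-- Grp is 1, 2, 3, 4 for 2018 Q1-Q4 and 5 for all four 2019 quarters
```

Each group therefore contains exactly one non-NULL sales value followed by the empty quarters after it, so MAX(sales) over the group returns that value for every row in it.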

Answer 2 (score: 1)

I would just do a JOIN:

SELECT TT.YEAR, TT.Quarter, COALESCE(T.SALES, MAX(T.SALES) OVER (PARTITION BY TT.YEAR)) AS sales 
FROM (SELECT DISTINCT T.YEAR, TT.Quarter
      FROM [MyTable] T CROSS JOIN
           ( SELECT DISTINCT TT.Quarter FROM [MyTable] TT ) TT
     ) TT LEFT JOIN 
     [MyTable] T 
     ON TT.YEAR = T.YEAR AND TT.Quarter = T.Quarter;

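Edit: I misread the question about the other quarters. The query above fills gaps with the year's maximum sales rather than its latest value, so you need OUTER APPLY in addition to the JOIN: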

SELECT TT.YEAR, TT.Quarter, COALESCE(T.SALES, T1.SALES) AS Sales 
FROM (SELECT DISTINCT T.YEAR, TT.Quarter
      FROM [MyTable] T CROSS JOIN
           ( SELECT DISTINCT TT.Quarter FROM [MyTable] TT ) TT
     ) TT LEFT JOIN 
     [MyTable] T 
     ON TT.YEAR = T.YEAR AND TT.Quarter = T.Quarter OUTER APPLY 
     ( SELECT TOP (1) T.*
       FROM [MyTable] T
       WHERE T.YEAR = TT.YEAR
       ORDER BY T.Quarter DESC
     ) T1;
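
Closer to the literal question, the original join condition can also be patched so that the `row_no = 1` fallback applies only when the year/quarter pair has no row of its own. This is a minimal sketch and is not taken from any of the answers above. It assumes, as in the sample data, that `row_no = 1` always marks the latest row.

```sql
select r.year, k.quarter, t.sales
from (select distinct year    from [MyTable]) r cross join
     (select distinct quarter from [MyTable]) k left join
     [MyTable] t
     on (t.year = r.year and t.quarter = k.quarter)   -- condition A: exact match
        or (t.row_no = 1                              -- condition B: fallback row
            and not exists (select 1                  -- only when A matches nothing
                            from [MyTable] x
                            where x.year = r.year
                              and x.quarter = k.quarter))
order by r.year, k.quarter;
```

Note that this fills any missing year/quarter pair with the latest row, not only gaps at the end, and that the correlated subquery makes it less efficient than the COALESCE variant in answer 0.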