Select the latest and second-latest dated rows for each user

Date: 2019-03-08 18:20:27

Tags: sql sql-server-2014

I have the following query, which selects the rows whose LAST_UPDATE_DATE value falls within the last 7 days. It works well.

   SELECT 'NEW ROW' AS 'ROW_TYPE', A.EMPLID, B.FIRST_NAME, B.LAST_NAME,
    A.BANK_CD, A.ACCOUNT_NUM, ACCOUNT_TYPE, PRIORITY, A.LAST_UPDATE_DATE
   FROM PS_DIRECT_DEPOSIT D
    INNER JOIN PS_DIR_DEP_DISTRIB A ON A.EMPLID = D.EMPLID AND A.EFFDT = D.EFFDT
    INNER JOIN PS_EMPLOYEES B ON B.EMPLID = A.EMPLID
   WHERE 
    B.EMPL_STATUS NOT IN ('T','R','D')
    AND ((A.DEPOSIT_TYPE = 'P' AND A.AMOUNT_PCT = 100)
          OR A.PRIORITY = 999
          OR A.DEPOSIT_TYPE = 'B')
    AND A.EFFDT = (SELECT MAX(A1.EFFDT)
                   FROM PS_DIR_DEP_DISTRIB A1
                   WHERE A1.EMPLID = A.EMPLID
                    AND A1.EFFDT <= GETDATE())
    AND D.EFF_STATUS = 'A'
    AND D.EFFDT = (SELECT MAX(D1.EFFDT)
                   FROM PS_DIRECT_DEPOSIT D1
                   WHERE D1.EMPLID = D.EMPLID
                    AND D1.EFFDT <= GETDATE())
    AND A.LAST_UPDATE_DATE >= GETDATE() - 7

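What I'd like to add is the previous row (the second MAX) for each EMPLID, so that I can output the "old" row (the most recent row before the last update that meets the criteria above) along with the new row I'm already outputting in my query.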

ROW_TYPE      EMPLID    FIRST_NAME   LAST_NAME      BANK_CD     ACCOUNT_NUM     ACCOUNT_TYPE    PRIORITY    LAST_UPDATE_DATE
NEW ROW       12345     JOHN         SMITH          123548999   45234879        C               999         2019-03-06 00:00:00.000
OLD ROW       12345     JOHN         SMITH          214080046   92178616        C               999         2018-10-24 00:00:00.000
NEW ROW       56399     CHARLES      MASTER         785816167   84314314        C               999         2019-03-07 00:00:00.000   
OLD ROW       56399     CHARLES      MASTER         345761227   547352          C               999         2017-05-16 00:00:00.000

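So the output would be ordered by EMPLID, with the NEW ROW first and then the OLD ROW, as shown above. In this example, the "new row" picks up records from the last 7 days, as indicated by LAST_UPDATE_DATE.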

I'd like feedback on how to modify the query so that I can also get the "old" row (the max row that is less than the "new" row retrieved above).

1 Answer:

Answer 0 (score: 0)

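It was a slow day for crime in Gotham, so I gave this a whirl. It might even work.

This is unlikely to work right out of the box, but it should get you started.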

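Your LAST_UPDATE_DATE column is on the table PS_DIR_DEP_DISTRIB, so we'll start there. First, you want to identify all of the records that were updated in the last 7 days, since those are the only ones you're interested in. Throughout this, I'm assuming, and quite possibly wrongly, that the natural key for the table consists of EMPLID, BANK_CD, and ACCOUNT_NUM. You'll need to substitute the actual natural key for those columns in a few places. That said, the date limiter looks something like this: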

  SELECT
    EMPLID
   ,BANK_CD
   ,ACCOUNT_NUM
  FROM
    PS_DIR_DEP_DISTRIB AS limit
  WHERE
    limit.LAST_UPDATE_DATE >= DATEADD(DAY, -7, CAST(GETDATE() AS DATE))
    AND 
    limit.LAST_UPDATE_DATE <= CAST(GETDATE() AS DATE)

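Now we'll use that as a correlated subquery in a WHERE EXISTS clause and correlate it back to the base table, which limits the results to records whose natural key values were updated in the last week. I changed the SELECT list to SELECT 1, which is typical for a correlated subquery under EXISTS: it stops looking for matches as soon as it finds one, and it doesn't actually return any values at all.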
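To see what the correlation columns do to the result, here is a minimal sketch using a table variable. The two rows for employee 12345 come from the expected output in the question; employee 77777, the `@dist` and `@cutoff` names, and the fixed cutoff date are assumptions added so the example is reproducible.

```sql
-- Hypothetical stand-in for PS_DIR_DEP_DISTRIB with only the columns needed here.
DECLARE @dist TABLE (
  EMPLID           INT,
  BANK_CD          VARCHAR(20),
  ACCOUNT_NUM      VARCHAR(20),
  LAST_UPDATE_DATE DATETIME
);

INSERT INTO @dist (EMPLID, BANK_CD, ACCOUNT_NUM, LAST_UPDATE_DATE)
VALUES
  (12345, '123548999', '45234879', '20190306'),  -- recent row from the question
  (12345, '214080046', '92178616', '20181024'),  -- older row from the question
  (77777, '111111111', '22222222', '20180115');  -- made-up employee with no recent update

-- Fixed cutoff instead of GETDATE() so the result does not change over time.
DECLARE @cutoff DATETIME = '20190301';

SELECT d.EMPLID, d.BANK_CD, d.ACCOUNT_NUM, d.LAST_UPDATE_DATE
FROM @dist AS d
WHERE EXISTS
  (
    SELECT 1
    FROM @dist AS recent
    WHERE recent.LAST_UPDATE_DATE >= @cutoff
      AND recent.EMPLID = d.EMPLID
      -- Adding the two conditions below would drop the 20181024 row,
      -- because its bank and account differ from the recent row:
      -- AND recent.BANK_CD = d.BANK_CD
      -- AND recent.ACCOUNT_NUM = d.ACCOUNT_NUM
  );
```

Correlating on EMPLID alone returns both rows for 12345 and nothing for 77777. In the question's sample output the old and new rows have different bank and account values, so it may be that EMPLID is the only column the correlation should use; that depends on the real key of the table.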

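Also, since we're filtering that record set anyway, I moved all of the other WHERE clause filters for that table into this subquery.

Finally, in the SELECT section, I added a DENSE_RANK to force an ordering on the records. Later, we'll use the DENSE_RANK value to filter down to only the top (N) records we care about.

So, that leaves us with this: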

SELECT
  EMPLID
 ,BANK_CD
 ,ACCOUNT_NUM
 --,ACCOUNT_TYPE --Might belong here. Can't tell without table alias in original SELECT
 ,PRIORITY
 ,EFFDT
 ,LAST_UPDATE_DATE
 ,DEPOSIT_TYPE
 ,AMOUNT_PCT
 ,DENSE_RANK() OVER (PARTITION BY --Add actual natural key columns here...
                       EMPLID
                     ORDER BY
                       LAST_UPDATE_DATE DESC
                    ) AS RowNum
FROM
  PS_DIR_DEP_DISTRIB AS sdist
WHERE
  EXISTS
    (
      -- Get the set of records that were last updated in the last 7 days.
      -- Correlate to the outer query so it only returns records related to this subset.
      -- This uses a correlated subquery. A JOIN will work, too. Try both, pick the faster one.
      -- Something like this, using the actual natural key columns in the WHERE
      SELECT
        1
      FROM
        PS_DIR_DEP_DISTRIB AS limit
      WHERE
        --The first two define the date range.
        limit.LAST_UPDATE_DATE >= DATEADD(DAY, -7, CAST(GETDATE() AS DATE))
        AND limit.LAST_UPDATE_DATE <= CAST(GETDATE() AS DATE)
        AND
        --And these are the correlations to the outer query.
        limit.EMPLID = sdist.EMPLID
        AND limit.BANK_CD = sdist.BANK_CD
        AND limit.ACCOUNT_NUM = sdist.ACCOUNT_NUM
    )
  AND
  (
    --Parenthesized as in the original question, so the ORs cannot bypass the EXISTS.
    (
      sdist.DEPOSIT_TYPE = 'P'
      AND sdist.AMOUNT_PCT = 100
    )
    OR sdist.PRIORITY = 999
    OR sdist.DEPOSIT_TYPE = 'B'
  )
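A caveat on the ranking function: DENSE_RANK gives tied rows the same rank, so a later `RowNum <= 2` filter can return more than two rows per employee. The sketch below contrasts it with ROW_NUMBER on inline data. Two of the rows come from the question's sample output; the tied row with account 99999999 is invented for illustration.

```sql
-- Compare DENSE_RANK and ROW_NUMBER when two rows share the same LAST_UPDATE_DATE.
SELECT
  v.EMPLID,
  v.ACCOUNT_NUM,
  v.LAST_UPDATE_DATE,
  DENSE_RANK() OVER (PARTITION BY v.EMPLID
                     ORDER BY v.LAST_UPDATE_DATE DESC) AS DenseRank,
  ROW_NUMBER() OVER (PARTITION BY v.EMPLID
                     ORDER BY v.LAST_UPDATE_DATE DESC) AS RowNum
FROM
  (VALUES
    (12345, '45234879', CAST('20190306' AS DATETIME)),
    (12345, '99999999', CAST('20190306' AS DATETIME)),  -- hypothetical tie
    (12345, '92178616', CAST('20181024' AS DATETIME))
  ) AS v (EMPLID, ACCOUNT_NUM, LAST_UPDATE_DATE);
```

Here DENSE_RANK assigns 1, 1, 2, so filtering on a rank of 2 or less keeps all three rows and labels two of them as the newest. ROW_NUMBER assigns 1, 2, 3, which caps the result at two rows, but the order among tied rows is arbitrary unless you add a tiebreaker column to the ORDER BY. Which behavior is right depends on whether an employee can legitimately have several rows with the same update date.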

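Use that query in place of the original INNER JOIN to PS_DIR_DEP_DISTRIB. In the SELECT list, the first hard-coded value now depends on the RowNum value, so it is now a CASE expression. In the WHERE clause, the dates are all driven by the subquery, so the date filters are gone, a few of the other filters have been folded into the subquery, and we added WHERE dist.RowNum <= 2 to bring back the top 2 records.

(I also replaced all of the table aliases so that I could keep track of what I was looking at.)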

SELECT
  CASE dist.RowNum
    WHEN 1 THEN 'NEW ROW'
    ELSE 'OLD ROW' 
  END AS ROW_TYPE
 ,dist.EMPLID
 ,emp.FIRST_NAME
 ,emp.LAST_NAME
 ,dist.BANK_CD
 ,dist.ACCOUNT_NUM
 ,ACCOUNT_TYPE
 ,dist.PRIORITY
 ,dist.LAST_UPDATE_DATE
FROM
  PS_DIRECT_DEPOSIT AS dd
INNER JOIN
  (
    SELECT
      EMPLID
     ,BANK_CD
     ,ACCOUNT_NUM
     --,ACCOUNT_TYPE --Might belong here. Can't tell without table alias in original SELECT
     ,PRIORITY
     ,EFFDT
     ,LAST_UPDATE_DATE
     ,DEPOSIT_TYPE
     ,AMOUNT_PCT
     ,DENSE_RANK() OVER (PARTITION BY --Add actual natural key columns here...
                           EMPLID
                         ORDER BY
                           LAST_UPDATE_DATE DESC
                        ) AS RowNum
    FROM
      PS_DIR_DEP_DISTRIB AS sdist
    WHERE
      EXISTS
        (
          -- Get the set of records that were last updated in the last 7 days.
          -- Correlate to the outer query so it only returns records related to this subset.
          -- This uses a correlated subquery. A JOIN will work, too. Try both, pick the faster one.
          -- Something like this, using the actual natural key columns in the WHERE
          SELECT
            1
          FROM
            PS_DIR_DEP_DISTRIB AS limit
          WHERE
            --The first two define the date range.
            limit.LAST_UPDATE_DATE >= DATEADD(DAY, -7, CAST(GETDATE() AS DATE))
            AND limit.LAST_UPDATE_DATE <= CAST(GETDATE() AS DATE)
            AND
            --And these are the correlations to the outer query.
            limit.EMPLID = sdist.EMPLID
            AND limit.BANK_CD = sdist.BANK_CD
            AND limit.ACCOUNT_NUM = sdist.ACCOUNT_NUM
        )
      AND
      (
        --Parenthesized as in the original question, so the ORs cannot bypass the EXISTS.
        (
          sdist.DEPOSIT_TYPE = 'P'
          AND sdist.AMOUNT_PCT = 100
        )
        OR sdist.PRIORITY = 999
        OR sdist.DEPOSIT_TYPE = 'B'
      )
  ) AS dist
    ON
    dist.EMPLID = dd.EMPLID
      AND dist.EFFDT = dd.EFFDT
INNER JOIN
  PS_EMPLOYEES AS emp
    ON
    emp.EMPLID = dist.EMPLID
WHERE
  dist.RowNum <= 2
  AND
  emp.EMPL_STATUS NOT IN ('T', 'R', 'D')
  AND 
  dd.EFF_STATUS = 'A';
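The question also asks for the rows to come back grouped by EMPLID with the NEW ROW ahead of the OLD ROW. The query above has no ORDER BY, so SQL Server does not guarantee any particular order. A minimal sketch of the ordering, using inline rows from the question's sample output with assumed RowNum values in place of the real tables:

```sql
-- Sort by employee first, then by rank, so each NEW ROW precedes its OLD ROW.
SELECT
  CASE v.RowNum
    WHEN 1 THEN 'NEW ROW'
    ELSE 'OLD ROW'
  END AS ROW_TYPE,
  v.EMPLID,
  v.LAST_UPDATE_DATE
FROM
  (VALUES
    (56399, 2, CAST('20170516' AS DATETIME)),
    (12345, 1, CAST('20190306' AS DATETIME)),
    (56399, 1, CAST('20190307' AS DATETIME)),
    (12345, 2, CAST('20181024' AS DATETIME))
  ) AS v (EMPLID, RowNum, LAST_UPDATE_DATE)
ORDER BY
  v.EMPLID,
  v.RowNum;
```

In the full query, the equivalent would be to add `ORDER BY dist.EMPLID, dist.RowNum` before the final semicolon.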