How do I join to another table and return only the most recent matching row?

Date: 2016-07-20 15:51:27

Tags: sql sql-server sql-server-2012

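I have a table that stores the lines of a contract. Each contract line has its own unique ID, along with the ID of its parent contract. For example: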

+-------------+---------+
| contract_id | line_id |
+-------------+---------+
|        1111 |     100 |
|        1111 |     101 |
|        1111 |     102 |
+-------------+---------+

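I have another table that stores the change history of contract lines. Every time the number of units on a contract line changes, a new row is added to this table. For example: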

+-------------+---------+--------------+-------+
| contract_id | line_id | date_changed | units |
+-------------+---------+--------------+-------+
|        1111 |     100 | 2016-01-01   |     1 |
|        1111 |     100 | 2016-02-01   |     2 |
|        1111 |     100 | 2016-03-01   |     3 |
+-------------+---------+--------------+-------+
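
For anyone who wants to reproduce this, here is a minimal setup sketch. The table and column names come from the examples above; the column types and the primary key are assumptions, since only names and sample values are shown.

```sql
-- Assumed types: the question shows only column names and sample values.
CREATE TABLE contract_lines (
    contract_id INT NOT NULL,
    line_id     INT NOT NULL,
    PRIMARY KEY (contract_id, line_id)  -- assumed key
);

CREATE TABLE contract_history (
    contract_id  INT  NOT NULL,
    line_id      INT  NOT NULL,
    date_changed DATE NOT NULL,
    units        INT  NOT NULL
);

-- Three lines on contract 1111; only line 100 has history rows in the example.
INSERT INTO contract_lines (contract_id, line_id)
VALUES (1111, 100), (1111, 101), (1111, 102);

-- 'YYYYMMDD' literals avoid any dependence on language/date-format settings.
INSERT INTO contract_history (contract_id, line_id, date_changed, units)
VALUES (1111, 100, '20160101', 1),
       (1111, 100, '20160201', 2),
       (1111, 100, '20160301', 3);
```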

As you can see, contract line 100, which belongs to contract 1111, has been edited 3 times over 3 months. Its current value is 3 units.

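I am running a query against the contract lines table to select all of its data. I want to join to the history table, pick the most recent history row for each contract line, and show its units in the results. How can I do this?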

Expected result (lines 101 and 102 would also have a single row each):

+-------------+---------+-------+
| contract_id | line_id | units |
+-------------+---------+-------+
|        1111 |     100 |     3 |
+-------------+---------+-------+

I tried the query below using a left join, but it returns 3 rows instead of 1.

Query:

SELECT *, T1.units
FROM contract_lines
LEFT JOIN (
    SELECT contract_id, line_id, units, MAX(date_changed) AS maxdate
    FROM contract_history
    GROUP BY contract_id, line_id, units) AS T1
    ON contract_lines.contract_id = T1.contract_id 
    AND contract_lines.line_id = T1.line_id

Actual result:

+-------------+---------+-------+
| contract_id | line_id | units |
+-------------+---------+-------+
|        1111 |     100 |     1 |
|        1111 |     100 |     2 |
|        1111 |     100 |     3 |
+-------------+---------+-------+

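5 answers:

Answer 0 (score: 3)

An extra join back to contract_history on maxdate will do it. Note that units is dropped from the GROUP BY: grouping by units produces one group per distinct units value, which is why your query returned three rows.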

SELECT contract_lines.*,T2.units
FROM contract_lines
LEFT JOIN (
    SELECT contract_id, line_id, MAX(date_changed) AS maxdate
    FROM contract_history
    GROUP BY contract_id, line_id) AS T1 
    JOIN contract_history T2 ON 
         T1.contract_id=T2.contract_id and 
         T1.line_id= T2.line_id and 
         T1.maxdate=T2.date_changed
ON contract_lines.contract_id = T1.contract_id
AND contract_lines.line_id = T1.line_id

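Answer 1 (score: 1)

This is my preferred style, because it doesn't require a self-join and it expresses your intent cleanly. It also competes very well with the ROW_NUMBER() approach in terms of performance.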

select a.*
     , b.units
from contract_lines as a
join (
    -- date_changed must be selected here so it can be compared to max_date_changed below
    select h.contract_id
         , h.line_id
         , h.units
         , h.date_changed
         , Max(h.date_changed) over(partition by h.contract_id, h.line_id) as max_date_changed
    from contract_history as h
) as b
    on a.contract_id = b.contract_id
   and a.line_id = b.line_id
   and b.date_changed = b.max_date_changed;
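
For comparison, the ROW_NUMBER() approach mentioned above might look something like the sketch below. It uses the question's table names; the aliases cl, h and rn are only illustrative. One difference worth knowing: if two history rows for the same line share the same date_changed, the MAX() OVER version returns both, while ROW_NUMBER() returns exactly one (which one is arbitrary unless you add a tie-breaker to the ORDER BY).

```sql
SELECT cl.*, h.units
FROM contract_lines AS cl
JOIN (
    SELECT contract_id, line_id, units,
           -- rn = 1 marks the newest history row within each contract line
           ROW_NUMBER() OVER (PARTITION BY contract_id, line_id
                              ORDER BY date_changed DESC) AS rn
    FROM contract_history
) AS h
    ON cl.contract_id = h.contract_id
   AND cl.line_id = h.line_id
   AND h.rn = 1;
```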

Answer 2 (score: 0)

Another possible solution. This uses RANK to order and filter the history rows. It's similar to what you did, just a different tack.

SELECT contract_lines.*, T1.units
FROM contract_lines
LEFT JOIN (
    SELECT contract_id, line_id, units,
    RANK() OVER (PARTITION BY contract_id, line_id ORDER BY date_changed DESC) AS [rank]
    FROM contract_history) AS T1
ON contract_lines.contract_id = T1.contract_id 
AND contract_lines.line_id = T1.line_id
AND T1.rank = 1
WHERE T1.units IS NOT NULL

If you expect the history data to always be present, you can change this to an INNER JOIN and remove the IS NOT NULL check from the WHERE clause.
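
A sketch of that INNER JOIN variant, assuming units itself is never NULL in contract_history (otherwise the IS NOT NULL filter was also hiding those rows, and this version would return them):

```sql
SELECT contract_lines.*, T1.units
FROM contract_lines
INNER JOIN (
    SELECT contract_id, line_id, units,
           RANK() OVER (PARTITION BY contract_id, line_id ORDER BY date_changed DESC) AS [rank]
    FROM contract_history) AS T1
ON contract_lines.contract_id = T1.contract_id
AND contract_lines.line_id = T1.line_id
AND T1.[rank] = 1  -- contract lines with no history rows are dropped by the inner join
```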

Glad you figured it out!

Answer 3 (score: 0)

Try this simple query:

SELECT TOP 1 T1.*
FROM contract_lines T0 
    INNER JOIN contract_history T1 
        ON T0.contract_id = T1.contract_id and 
            T0.line_id = T1.line_id 
ORDER BY date_changed DESC
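
Note that TOP 1 here applies to the whole result set, so this returns a single row overall rather than one row per contract line. If one row per line is needed, a possible variant is to move the TOP 1 into an OUTER APPLY so it is evaluated once per contract line. This is only a sketch and not part of the query above; the alias h is illustrative.

```sql
SELECT T0.*, T1.units
FROM contract_lines T0
OUTER APPLY (
    -- evaluated once per contract line: newest history row for that line
    SELECT TOP 1 h.units
    FROM contract_history h
    WHERE h.contract_id = T0.contract_id
      AND h.line_id = T0.line_id
    ORDER BY h.date_changed DESC
) T1
```

OUTER APPLY keeps contract lines that have no history (units comes back NULL); CROSS APPLY would drop them.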

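Answer 4 (score: -1)

After spending an hour staring at this and grumbling about StackOverflow being down for a rare maintenance window, I solved my own problem shortly after posting the question, as always seems to be the way.

To help anyone else who is stuck, I'll show what I found. This may not be an efficient way to do it, so if anyone has a better suggestion, I'm all ears.

I adapted the answer from here: T-SQL Subquery Max(Date) and Joins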

SELECT *,
       Units = (SELECT TOP 1 units
                FROM contract_history
                WHERE contract_lines.contract_id = contract_history.contract_id
                AND contract_lines.line_id = contract_history.line_id
                ORDER BY date_changed DESC
                )
FROM ....