How to limit a self-join to the top 1 row in SQL Server

Time: 2013-07-24 16:08:18

Tags: sql sql-server join self-join

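I need to perform a self-join that can match multiple rows, but I need to limit the join to one row per record. When multiple rows match the join condition, only the Value from the row with the largest PK should be used. Here is a simplified, hypothetical schema: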

CREATE TABLE #Records(
    Id int NOT NULL,
    GroupId int NOT NULL,
    Node varchar(10) NOT NULL,
    Value varchar(10) NULL,
    Meta1 varchar(10) NULL,
    Meta2 varchar(10) NULL,
    Meta3 varchar(10) NULL
)

Here are some sample inserts:

INSERT INTO #Records VALUES(1,123,'Parent', '888', 'meta1', 'meta2', 'meta3')
INSERT INTO #Records VALUES(2,123,'Guardian', '789', 'meta1', 'meta2', 'meta3')
INSERT INTO #Records VALUES(3,123,'Parent', '999', 'meta1', 'meta2', 'meta3')
INSERT INTO #Records VALUES(4,123,'Guardian', '654', 'meta1', 'meta2', 'meta3')
INSERT INTO #Records VALUES(5,123,'Sibling', '222', 'meta1', 'meta2', 'meta3')
INSERT INTO #Records VALUES(6,456,'Parent', '777', 'meta1', 'meta2', 'meta3')
INSERT INTO #Records VALUES(7,456,'Guardian', '333', 'meta1', 'meta2', 'meta3')

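In general terms, I want the number of rows returned to equal the number of records in the table. I need a Parent column and a Guardian column. For a matching GroupId, Parent should equal the Value of the latest row (highest Id) whose Node is 'Parent'. I need the same for Guardian, except that Node should be 'Guardian'. The result should look like this: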

Id  GroupId  Node      Value  Meta1  Meta2  Meta3  Parent  Guardian
--- -------- --------- ------ ------ ------ ------ ------- --------
1   123      Parent    888    meta1  meta2  meta3  999     654
2   123      Guardian  789    meta1  meta2  meta3  999     654
3   123      Parent    999    meta1  meta2  meta3  999     654
4   123      Guardian  654    meta1  meta2  meta3  999     654
5   123      Sibling   222    meta1  meta2  meta3  999     654
6   456      Parent    777    meta1  meta2  meta3  777     333
7   456      Guardian  333    meta1  meta2  meta3  777     333

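Note that I have this partially working, but it does not limit the join to the latest value. It works fine when all Parent nodes and all Guardian nodes in a group have the same Value. I have tried to limit it with MAX, but every attempt failed. Looking at this query may bias your judgment, so do not hesitate to throw it out entirely.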

SELECT #Records.*, Parent,Guardian
FROM #Records
LEFT JOIN (
    SELECT MAX(Id) As ParentRow, GroupId, Value AS Parent
    FROM #Records
    WHERE Node = 'Parent'
    GROUP BY GroupId, Value
) AS Parents
ON #Records.GroupId = Parents.GroupId
LEFT JOIN (
    SELECT MAX(Id) As ParentRow, GroupId, Value AS Guardian
    FROM #Records
    WHERE Node = 'Guardian'
    GROUP BY GroupId, Value
) AS Guardians
ON #Records.GroupId = Guardians.GroupId

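Thanks in advance!

2 Answers:

Answer 0: (score: 2)

You are close, but your subqueries return too many rows because they group by more than one column (GroupId and Value). Since you want the row with MAX(Id) for each GroupId, you can use the ROW_NUMBER() function instead: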

SELECT DISTINCT #Records.*, Parent, Guardian
FROM #Records
LEFT JOIN ( SELECT GroupId, Value AS Parent,
                   ROW_NUMBER() OVER (PARTITION BY GroupId ORDER BY Id DESC) AS RowRank
            FROM #Records
            WHERE Node = 'Parent'
          ) AS Parents
    ON #Records.GroupId = Parents.GroupId
      AND Parents.RowRank = 1      -- keep only the newest Parent row per group
LEFT JOIN ( SELECT GroupId, Value AS Guardian,
                   ROW_NUMBER() OVER (PARTITION BY GroupId ORDER BY Id DESC) AS RowRank
            FROM #Records
            WHERE Node = 'Guardian'
          ) AS Guardians
    ON #Records.GroupId = Guardians.GroupId
      AND Guardians.RowRank = 1    -- keep only the newest Guardian row per group
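
To see where the extra rows come from, it can help to run the derived tables on their own. The following is a minimal sketch, assuming the #Records table and the sample rows from the question are already loaded; the comments describe what that sample data should produce, with no guarantee about row order.

```sql
-- Original derived table: grouping by Value as well yields one row per
-- distinct (GroupId, Value) pair, so group 123 contributes two Parent rows
-- (Values '888' and '999') and each group-123 record matches both of them.
SELECT MAX(Id) AS ParentRow, GroupId, Value AS Parent
FROM #Records
WHERE Node = 'Parent'
GROUP BY GroupId, Value;

-- ROW_NUMBER() version: every Parent row is ranked within its group,
-- newest Id first, so filtering on RowRank = 1 leaves one row per GroupId
-- (Value '999' for group 123 and '777' for group 456).
SELECT GroupId, Value AS Parent, RowRank
FROM ( SELECT GroupId, Value,
              ROW_NUMBER() OVER (PARTITION BY GroupId ORDER BY Id DESC) AS RowRank
       FROM #Records
       WHERE Node = 'Parent'
     ) AS Ranked
WHERE RowRank = 1;
```

The alias Ranked is only illustrative. The point is that the ranking is computed per GroupId alone, so Value no longer influences how many rows each group returns.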

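Answer 1: (score: 1)

The following gets the most recent parent/guardian in the group for each row in the original table. It uses correlated subqueries in the select clause to do the work: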

select r.*,
       (select top 1 r2.Value
        from #Records r2
        where r2.GroupId = r.GroupId and r2.Node = 'Parent'
        order by r2.Id desc
       ) as Parent,
       (select top 1 r2.Value
        from #Records r2
        where r2.GroupId = r.GroupId and r2.Node = 'Guardian'
        order by r2.Id desc
       ) as Guardian
from #Records r;

Using nested selects ensures that every row in the original table is included exactly once.
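
A related option in SQL Server, which is not part of the original answer, is to move the same correlated TOP 1 lookup into the FROM clause with OUTER APPLY. The sketch below assumes the #Records table from the question; the aliases p and g are only illustrative.

```sql
-- Sketch only: OUTER APPLY evaluates the TOP (1) lookup once per outer row.
-- Rows whose group has no Parent or Guardian are kept, with NULL in that column.
SELECT r.*, p.Parent, g.Guardian
FROM #Records AS r
OUTER APPLY (
    SELECT TOP (1) r2.Value AS Parent
    FROM #Records AS r2
    WHERE r2.GroupId = r.GroupId
      AND r2.Node = 'Parent'
    ORDER BY r2.Id DESC          -- newest Parent row in the group
) AS p
OUTER APPLY (
    SELECT TOP (1) r2.Value AS Guardian
    FROM #Records AS r2
    WHERE r2.GroupId = r.GroupId
      AND r2.Node = 'Guardian'
    ORDER BY r2.Id DESC          -- newest Guardian row in the group
) AS g;
```

Because each applied subquery returns at most one row, this form also returns exactly one row per record. It may be more convenient than scalar subqueries if more than one column is needed from the matched row.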

In some databases, you can do this with window/analytic functions. For example, here is Oracle syntax:

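select r.*,
       last_value(case when Node = 'Parent' then Value end ignore nulls)
           over (partition by GroupId order by Id
                 rows between unbounded preceding and unbounded following) as Parent,
       last_value(case when Node = 'Guardian' then Value end ignore nulls)
           over (partition by GroupId order by Id
                 rows between unbounded preceding and unbounded following) as Guardian
from Records r;  -- assumes an ordinary Oracle table named Records with the same columns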