SQL: Find duplicates based on custom criteria

Time: 2017-10-04 16:01:53

Tags: sql sql-server

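I need to find duplicates across two tables based on custom criteria. The following rule determines whether a record is a duplicate; if it is, only the most recent one should be shown:

If an employee's Name and all of that employee's EmployeePolicy CoverageId(s) exactly match those of another record, the record is considered a duplicate.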

--Employee Table
EmployeeId  Name  Salary
543         John  54000
785         Alex  63000
435         John  75000
123         Alex  88000
333         John  67000

--EmployeePolicy Table
EmployeePolicyId  EmployeeId  CoverageId
1                 543         8888
2                 543         7777
3                 785         5555
4                 435         8888
5                 435         7777
6                 123         4444
7                 333         8888
8                 333         7776
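
For anyone who wants to run the queries in this thread, the sample data can be loaded with a script along these lines. The column types and constraints are assumptions; the question only shows the values.

```sql
-- Sample data from the question.
-- Column types and keys are assumptions; the question does not state them.
CREATE TABLE Employee (
    EmployeeId INT         NOT NULL PRIMARY KEY,
    Name       VARCHAR(50) NOT NULL,
    Salary     INT         NOT NULL
);

CREATE TABLE EmployeePolicy (
    EmployeePolicyId INT NOT NULL PRIMARY KEY,
    EmployeeId       INT NOT NULL REFERENCES Employee (EmployeeId),
    CoverageId       INT NOT NULL
);

INSERT INTO Employee (EmployeeId, Name, Salary) VALUES
    (543, 'John', 54000),
    (785, 'Alex', 63000),
    (435, 'John', 75000),
    (123, 'Alex', 88000),
    (333, 'John', 67000);

INSERT INTO EmployeePolicy (EmployeePolicyId, EmployeeId, CoverageId) VALUES
    (1, 543, 8888),
    (2, 543, 7777),
    (3, 785, 5555),
    (4, 435, 8888),
    (5, 435, 7777),
    (6, 123, 4444),
    (7, 333, 8888),
    (8, 333, 7776);
```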

For example, the duplicates in the data above are:

EmployeeId Name Salary
543        John 54000
435        John 75000

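This is because they are the only rows in the Employee table with matching names whose CoverageIds in the EmployeePolicy table are also identical.

Note: EmployeeId 333, also with Name = John, is not a match, because his two CoverageIds are not both the same as the other Johns' CoverageIds.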

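At first I tried the old-fashioned way of finding duplicates, grouping the records and filtering on COUNT(*) > 1, but I quickly realized it would not work: my criteria define a duplicate in plain English, but in SQL the CoverageIds are different values on different rows, so the rows are not treated as duplicates.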
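
The query for that first attempt was not posted, so the following is only a reconstruction of what a per-row GROUP BY check might look like. It shows the limitation: grouping on (Name, CoverageId) can flag individual shared coverages, but it cannot say whether an employee's whole set of coverages matches.

```sql
-- Reconstruction (an assumption, not the original query):
-- count how many employees with the same Name share each CoverageId.
SELECT e.Name, ep.CoverageId, COUNT(*) AS Cnt
FROM Employee e
JOIN EmployeePolicy ep ON ep.EmployeeId = e.EmployeeId
GROUP BY e.Name, ep.CoverageId
HAVING COUNT(*) > 1;
```

On the sample data this returns (John, 7777, 2) and (John, 8888, 3). The count of 3 for 8888 includes EmployeeId 333, who shares only one of his two coverages with the other Johns.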

By the same token, I tried something like this:

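-- Create a TMP table holding each employee joined to their policies
SELECT e.EmployeeId, e.Name, e.Salary, ep.CoverageId
INTO #tmp
FROM Employee e
JOIN EmployeePolicy ep ON e.EmployeeId = ep.EmployeeId;

-- Keep one row per (Name, CoverageId), preferring the highest EmployeeId
SELECT info.*
FROM
(
    SELECT
        tmp.*,
        ROW_NUMBER() OVER(PARTITION BY tmp.Name, tmp.CoverageId ORDER BY tmp.EmployeeId DESC) AS RowNum
    FROM #tmp tmp
) info
WHERE
    info.RowNum = 1;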

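Again, this does not work, because SQL does not see these rows as duplicates. I am not sure how to translate my English definition of a duplicate into a SQL definition of a duplicate.

Any help is greatly appreciated.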

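2 Answers:

Answer 0: (score: 3)

The simplest approach would be to concatenate the policies into a string. Alas, that is cumbersome in SQL Server. Here is a set-based approach: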

with ep as (
      -- each policy row plus the number of policies held by that employee
      select ep.*, count(*) over (partition by employeeid) as cnt
      from employeepolicy ep
     )
select ep.employeeid, ep2.employeeid
from ep join
     ep ep2
     on ep.employeeid < ep2.employeeid and
        ep.CoverageId = ep2.CoverageId and
        ep.cnt = ep2.cnt
group by ep.employeeid, ep2.employeeid, ep.cnt
having count(*) = ep.cnt   -- all match (cnt must be qualified; both ep and ep2 expose it)

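The idea is to match the coverages of different employees. The join requires that a coverage match and that both employees hold the same number of coverages. The HAVING clause then checks that the number of matching coverages equals that total, so every coverage matches.

Note: this returns each pair of employee IDs on a single row. You can join to the Employee table to get the other columns, and to compare names, which this query does not do.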
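
A minimal sketch of that join back to the Employee table, which also adds the Name comparison the question asks for. The CTE names (PolicyCounts, MatchingPairs, SameNamePairs) are made up for illustration, and the sketch assumes an employee never holds the same CoverageId twice; repeated coverages would inflate the counts.

```sql
WITH PolicyCounts AS (
    -- Each policy row plus the number of policies its employee holds
    SELECT p.EmployeeId, p.CoverageId,
           COUNT(*) OVER (PARTITION BY p.EmployeeId) AS Cnt
    FROM EmployeePolicy p
),
MatchingPairs AS (
    -- Pairs of employees whose coverage sets are identical
    SELECT a.EmployeeId AS EmployeeId1, b.EmployeeId AS EmployeeId2
    FROM PolicyCounts a
    JOIN PolicyCounts b
      ON a.EmployeeId < b.EmployeeId
     AND a.CoverageId = b.CoverageId
     AND a.Cnt = b.Cnt
    GROUP BY a.EmployeeId, b.EmployeeId, a.Cnt
    HAVING COUNT(*) = a.Cnt
),
SameNamePairs AS (
    -- Keep only the pairs that also share a Name
    SELECT mp.EmployeeId1, mp.EmployeeId2
    FROM MatchingPairs mp
    JOIN Employee e1 ON e1.EmployeeId = mp.EmployeeId1
    JOIN Employee e2 ON e2.EmployeeId = mp.EmployeeId2
    WHERE e1.Name = e2.Name
)
SELECT e.EmployeeId, e.Name, e.Salary
FROM Employee e
WHERE EXISTS (
    SELECT 1
    FROM SameNamePairs sp
    WHERE e.EmployeeId IN (sp.EmployeeId1, sp.EmployeeId2)
)
ORDER BY e.EmployeeId DESC;
```

On the sample data from the question this returns EmployeeId 543 and 435, the two rows the question lists as duplicates.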

Answer 1: (score: 0)

I have not tested this T-SQL, but I believe the following will give you the output you need.

;WITH CTE_Employee
AS
(
    SELECT       E.[Name]
                ,E.[EmployeeId]
                ,P.[CoverageId]
                ,E.[Salary]
    FROM        Employee E
    INNER JOIN  EmployeePolicy P ON E.EmployeeId = P.EmployeeId
)
, CTE_DuplicateCoverage
AS
(
    SELECT       E.[Name]
                ,E.[CoverageId]
    FROM        CTE_Employee E
    GROUP BY    E.[Name], E.[CoverageId]
    HAVING      COUNT(*) > 1
)
SELECT      E.[EmployeeId]
            ,E.[Name]
            ,MAX(E.[Salary]) AS [Salary]
FROM        CTE_Employee E
INNER JOIN  CTE_DuplicateCoverage D ON E.[Name] = D.[Name] AND E.[CoverageId] = D.[CoverageId]
GROUP BY    E.[EmployeeId], E.[Name]
HAVING      COUNT(*) > 1
ORDER BY    E.[EmployeeId]
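
Separately from the query above, the string-concatenation idea mentioned in answer 0 can be sketched with STRING_AGG. This assumes SQL Server 2017 or later, where STRING_AGG is available; the CTE and column names (CoverageLists, Flagged, CoverageList, DupCount) are made up for illustration.

```sql
-- Sketch only: requires SQL Server 2017+ for STRING_AGG.
WITH CoverageLists AS (
    -- One row per employee with an ordered, comma-separated list of CoverageIds
    SELECT e.EmployeeId, e.Name, e.Salary,
           STRING_AGG(CAST(ep.CoverageId AS VARCHAR(20)), ',')
               WITHIN GROUP (ORDER BY ep.CoverageId) AS CoverageList
    FROM Employee e
    JOIN EmployeePolicy ep ON ep.EmployeeId = e.EmployeeId
    GROUP BY e.EmployeeId, e.Name, e.Salary
),
Flagged AS (
    -- Count how many employees share the same Name and the same coverage list
    SELECT cl.EmployeeId, cl.Name, cl.Salary,
           COUNT(*) OVER (PARTITION BY cl.Name, cl.CoverageList) AS DupCount
    FROM CoverageLists cl
)
SELECT EmployeeId, Name, Salary
FROM Flagged
WHERE DupCount > 1
ORDER BY EmployeeId DESC;
```

Ordering the list inside STRING_AGG matters: without WITHIN GROUP, two employees with the same coverages could produce differently ordered strings and fail to match. To keep only one row per duplicate group, a ROW_NUMBER() over the same partition could be added, ordered by whatever column defines "most recent"; the question does not say which column that is.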