I am working with employer data and need to find duplicate employers based on their names.
The data looks like this:
Employer ID | Legal Name | Operating Name
------------- | ---------------| --------------------
1 | AA | AA
2 | BB | AA
3 | CC | BB
4 | DD | DD
5 | ZZ | ZZ
Now, if I search for all duplicates of employer AA, the query should return the following:
Employer ID | Legal Name | Operating Name
------------- | ---------------| --------------------
1 | AA | AA
2 | BB | AA
3 | CC | BB
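Employer 1's legal name and employer 2's operating name match the search string directly. The catch is employer 3: it has no direct relation to the search string, but employer 2's legal name matches employer 3's operating name.
I need the search to follow such matches down to the nth level. I am not sure whether something like a recursive query can achieve this.
Please help.
I tried to do this with a recursive CTE, but then realized it goes into infinite recursion. Here is the code: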
DECLARE @SearchName VARCHAR(50)
SET @SearchName = 'AA'
;With CTE_EmployerNames
AS
(
-- Anchor Member definition
select *
from [dbo].[Name_Table]
where Leg_Name = @SearchName
OR Op_Name = @SearchName
UNION ALL
-- Recursive Member definition
select N.*
from [dbo].[Name_Table] N
JOIN CTE_EmployerNames C
ON N.ID <> C.ID
AND (N.Leg_Name = C.Leg_Name
OR N.Leg_Name = C.Op_Name
OR N.Op_Name = C.Leg_Name
OR N.Op_Name = C.Op_Name)
)
select *
from CTE_EmployerNames
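Update: I created a stored procedure that does what I want. Because of the loop and the cursor it is a bit slow, so for now it solves my problem at a small cost in execution time. Any suggestions for optimizing it, or for another way to do this, would be much appreciated. Thanks, everyone. Here is the code: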
CREATE PROCEDURE [dbo].[Get_Similar_Name_Employers]
@P_BaseName VARCHAR(100)
AS
BEGIN
DECLARE @ID INT
DECLARE @Leg_Name VARCHAR(50)
DECLARE @Op_Name VARCHAR(50)
-- Create temp table to hold data temporarily
CREATE TABLE #Temp_Employers
(
[ID] [int] NULL,
[Leg_Name] [varchar](50) NULL,
[Op_Name] [varchar](50) NULL,
[Status] [bit] null -- To keep track if that record is processed or not
)
-- Insert all records which are directly matching with search criteria
INSERT INTO #Temp_Employers
SELECT NT.ID, NT.Leg_Name, NT.Op_Name, 0
FROM dbo.Name_Table NT
WHERE NT.Leg_Name = @P_BaseName
OR NT.Op_Name = @P_BaseName
while EXISTS (SELECT 1 from #Temp_Employers where Status = 0) -- until all rows are processed
BEGIN
DECLARE @EmployerCursor CURSOR
SET @EmployerCursor = CURSOR FAST_FORWARD
FOR
SELECT ID, Leg_Name, Op_Name
from #Temp_Employers
where Status = 0
OPEN @EmployerCursor
FETCH NEXT
FROM @EmployerCursor
INTO @ID, @Leg_Name, @Op_Name
WHILE @@FETCH_STATUS = 0
BEGIN
-- For every unprocessed record in temp table check if there is any possible duplicate.
-- and insert all possible duplicate records in same table for further processing to find their possible duplicates
INSERT INTO #Temp_Employers
select ID, Leg_Name, Op_Name, 0
from dbo.Name_Table
WHERE (Leg_Name = @Leg_Name
OR Op_Name = @Op_Name
OR Leg_Name = @Op_Name
OR Op_Name = @Leg_Name)
AND ID NOT IN ( select ID
FROM #Temp_Employers)
-- Update status of recently processed record to avoid processing again
UPDATE #Temp_Employers
SET Status = 1
WHERE ID = @ID
FETCH NEXT
FROM @EmployerCursor
INTO @ID, @Leg_Name, @Op_Name
END
-- close cursor and deallocate memory
CLOSE @EmployerCursor
DEALLOCATE @EmployerCursor
END
select ID,
Leg_Name,
Op_Name
from #Temp_Employers
Order By ID
DROP TABLE #Temp_Employers
END
Answer 0 (score: 0)
You can do this with two self-joins. I use DISTINCT to be safe: you don't need it for your example, but you might for your real data:
SELECT DISTINCT T2.EMPID, T2.LEGAL_NAME, T.LEGAL_NAME
FROM TABLE T
INNER JOIN TABLE T2 ON T.LEGAL_NAME = T2.OPERATING_NAME
INNER JOIN TABLE T3 ON T2.OPERATING_NAME = T3.OPERATING_NAME
WHERE T.LEGAL_NAME <> T3.LEGAL_NAME
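Rename and alias the tables and columns as needed.
Edit: if you also want records whose operating name is completely different from the legal name, UNION them in: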
SELECT DISTINCT T2.EMPID, T2.LEGAL_NAME, T.LEGAL_NAME
FROM TABLE T
INNER JOIN TABLE T2 ON T.LEGAL_NAME = T2.OPERATING_NAME
INNER JOIN TABLE T3 ON T2.OPERATING_NAME = T3.OPERATING_NAME
WHERE T.LEGAL_NAME <> T3.LEGAL_NAME
UNION
SELECT EMPID, LEGAL_NAME, OPERATING_NAME
FROM TABLE
WHERE LEGAL_NAME <> OPERATING_NAME
Answer 1 (score: 0)
You are basically trying to build a directed acyclic graph in which the nodes are names, and you want to find all the names that lead to your employer.
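As a rough sketch of that traversal idea, a recursive CTE can be kept from looping by carrying the IDs already visited on each branch and refusing to revisit them. This assumes the `dbo.Name_Table` layout from the question (`ID`, `Leg_Name`, `Op_Name`) and that `ID` is unique; the `Visited` column is a hypothetical helper introduced here, not something from the original schema.

```sql
DECLARE @SearchName VARCHAR(50);
SET @SearchName = 'AA';

WITH CTE_EmployerNames AS
(
    -- Anchor: rows that match the search name directly.
    -- Visited (hypothetical helper) records the IDs on this branch as '/1/2/'.
    SELECT N.ID, N.Leg_Name, N.Op_Name,
           CAST('/' + CAST(N.ID AS VARCHAR(10)) + '/' AS VARCHAR(8000)) AS Visited
    FROM dbo.Name_Table AS N
    WHERE N.Leg_Name = @SearchName
       OR N.Op_Name  = @SearchName

    UNION ALL

    -- Recursive step: rows that share a name with a row already found,
    -- skipping any ID that is already on the current branch's path.
    SELECT N.ID, N.Leg_Name, N.Op_Name,
           CAST(C.Visited + CAST(N.ID AS VARCHAR(10)) + '/' AS VARCHAR(8000))
    FROM dbo.Name_Table AS N
    JOIN CTE_EmployerNames AS C
      ON (N.Leg_Name = C.Leg_Name
       OR N.Leg_Name = C.Op_Name
       OR N.Op_Name  = C.Leg_Name
       OR N.Op_Name  = C.Op_Name)
    WHERE C.Visited NOT LIKE '%/' + CAST(N.ID AS VARCHAR(10)) + '/%'
)
-- The same employer can be reached along several paths, hence DISTINCT.
SELECT DISTINCT ID, Leg_Name, Op_Name
FROM CTE_EmployerNames
ORDER BY ID
OPTION (MAXRECURSION 0);  -- lift the default limit of 100; the path check ends the recursion
```

With the five sample rows from the question this should return employers 1, 2 and 3. Because every distinct path is enumerated, the intermediate result can grow quickly when many employers share names, so this is probably only suitable for small clusters.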
There is an introductory tutorial at Oracle Tip: Solving directed graph problems with SQL, part 1, and a related Stack Overflow question at Directed graph SQL.
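For larger tables, the same traversal can also be written as a set-based loop that adds one whole level of matches per pass, instead of one row per cursor fetch. The following is a minimal sketch under the same assumptions (`dbo.Name_Table` as in the question, `ID` unique); the temp table `#Found` and the `Lvl` column are hypothetical names chosen for illustration.

```sql
DECLARE @SearchName VARCHAR(50);
DECLARE @Lvl  INT;
DECLARE @Rows INT;
SET @SearchName = 'AA';
SET @Lvl = 0;

-- #Found (hypothetical name) holds every employer reached so far.
CREATE TABLE #Found
(
    ID       INT PRIMARY KEY,   -- assumes ID is unique in dbo.Name_Table
    Leg_Name VARCHAR(50) NULL,
    Op_Name  VARCHAR(50) NULL,
    Lvl      INT NOT NULL       -- number of hops from the search name
);

-- Level 0: rows that match the search name directly.
INSERT INTO #Found (ID, Leg_Name, Op_Name, Lvl)
SELECT N.ID, N.Leg_Name, N.Op_Name, @Lvl
FROM dbo.Name_Table AS N
WHERE N.Leg_Name = @SearchName
   OR N.Op_Name  = @SearchName;

SET @Rows = @@ROWCOUNT;

-- Keep expanding until a pass adds no new employers.
WHILE @Rows > 0
BEGIN
    SET @Lvl = @Lvl + 1;

    -- Add every not-yet-found row that shares a name with the previous level.
    INSERT INTO #Found (ID, Leg_Name, Op_Name, Lvl)
    SELECT N.ID, N.Leg_Name, N.Op_Name, @Lvl
    FROM dbo.Name_Table AS N
    WHERE EXISTS (SELECT 1
                  FROM #Found AS F
                  WHERE F.Lvl = @Lvl - 1
                    AND (N.Leg_Name = F.Leg_Name
                      OR N.Leg_Name = F.Op_Name
                      OR N.Op_Name  = F.Leg_Name
                      OR N.Op_Name  = F.Op_Name))
      AND NOT EXISTS (SELECT 1 FROM #Found AS X WHERE X.ID = N.ID);

    SET @Rows = @@ROWCOUNT;
END;

SELECT ID, Leg_Name, Op_Name, Lvl
FROM #Found
ORDER BY ID;

DROP TABLE #Found;
```

Each employer is inserted at most once, so the loop ends after as many passes as the longest chain of name matches, even when the matches form cycles. Indexes on `Leg_Name` and `Op_Name` would likely help on a large table, though that has not been measured here.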