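I have the following data in a SQL table:
I need to query the data so that, for each employee, I get a list of the missing "familyid" values.
For example, I should find that employee 1021 is missing IDs 2 and 5, and that employee 1027 is missing numbers 1 and 6.
Any leads on how to query this?
Any help is appreciated.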
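Answer 0 (score: 3)
Finding the first missing value
I would use the ROW_NUMBER window function to assign the "correct" sequence ID numbers, assuming the sequence restarts each time the employee number changes: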
SELECT
    e.id,
    e.name,
    e.employee_number,
    e.relation,
    e.familyid,
    ROW_NUMBER() OVER (PARTITION BY e.employee_number ORDER BY e.familyid) - 1 AS sequenceid
FROM employee_members e
Then I would filter the result set to keep only the rows where the sequence ID does not match:
SELECT *
FROM (
    SELECT
        e.id,
        e.name,
        e.employee_number,
        e.relation,
        e.familyid,
        ROW_NUMBER() OVER (PARTITION BY e.employee_number ORDER BY e.familyid) - 1 AS sequenceid
    FROM employee_members e
) a
WHERE a.familyid <> a.sequenceid
From there it should be easy to group by employee_number and find the first missing sequence ID for each employee:
SELECT
    a.employee_number,
    MIN(a.sequenceid) AS first_missing
FROM (
    SELECT
        e.id,
        e.name,
        e.employee_number,
        e.relation,
        e.familyid,
        ROW_NUMBER() OVER (PARTITION BY e.employee_number ORDER BY e.familyid) - 1 AS sequenceid
    FROM employee_members e
) a
WHERE a.familyid <> a.sequenceid
GROUP BY a.employee_number
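To try the first-missing query without a SQL Server instance, here is a minimal sketch that runs it through Python's built-in sqlite3 module (SQLite 3.25 or later is needed for window functions). The sample rows are hypothetical: the question's data is not shown, so they are simply chosen so that employee 1021 lacks familyid 2 and 5 and employee 1027 lacks 1 and 6, with familyid starting at 0. Only the two columns the query needs are created.

```python
import sqlite3

# Hypothetical sample rows (employee_number, familyid); not the asker's real data.
ROWS = [(1021, f) for f in (0, 1, 3, 4, 6)] + [(1027, f) for f in (0, 2, 3, 4, 5, 7)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee_members (employee_number INTEGER, familyid INTEGER)")
conn.executemany("INSERT INTO employee_members VALUES (?, ?)", ROWS)

# Number each employee's rows 0, 1, 2, ... in familyid order, keep the rows
# where the numbering disagrees with familyid, and take the smallest such number.
FIRST_MISSING_SQL = """
SELECT a.employee_number, MIN(a.sequenceid) AS first_missing
FROM (
    SELECT e.employee_number,
           e.familyid,
           ROW_NUMBER() OVER (PARTITION BY e.employee_number
                              ORDER BY e.familyid) - 1 AS sequenceid
    FROM employee_members e
) a
WHERE a.familyid <> a.sequenceid
GROUP BY a.employee_number
ORDER BY a.employee_number
"""

first_missing = conn.execute(FIRST_MISSING_SQL).fetchall()
print(first_missing)
```

An employee with no gaps produces no row at all, because every familyid then equals its sequence number.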
Finding all missing values
Extending the previous query, we can detect a missing value each time the difference between familyid and sequenceid changes:
-- Warning: this is totally untested :-/
SELECT
    b.employee_number,
    -- the value just below the first row of each displacement group
    MIN(b.familyid) - 1 AS missing
FROM (
    SELECT
        a.*,
        a.familyid - a.sequenceid AS displacement
    FROM (
        SELECT
            e.*,
            ROW_NUMBER() OVER (PARTITION BY e.employee_number ORDER BY e.familyid) - 1 AS sequenceid
        FROM employee_members e
    ) a
) b
WHERE b.displacement <> 0
GROUP BY
    b.employee_number,
    b.displacement
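The displacement idea can be checked the same way. The sketch below is an assumption-laden test harness, not the original answer's tested code: it uses SQLite through Python's sqlite3 module, hypothetical sample rows built to match the gaps named in the question, and it reports `MIN(familyid) - 1` for each displacement group, which is the value immediately below the first row after each gap.

```python
import sqlite3

# Hypothetical sample rows (employee_number, familyid); not the asker's real data.
ROWS = [(1021, f) for f in (0, 1, 3, 4, 6)] + [(1027, f) for f in (0, 2, 3, 4, 5, 7)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee_members (employee_number INTEGER, familyid INTEGER)")
conn.executemany("INSERT INTO employee_members VALUES (?, ?)", ROWS)

# displacement = familyid - sequenceid grows by the gap size after every gap,
# so each distinct non-zero displacement marks one gap per employee.
ALL_MISSING_SQL = """
SELECT b.employee_number, MIN(b.familyid) - 1 AS missing
FROM (
    SELECT a.*, a.familyid - a.sequenceid AS displacement
    FROM (
        SELECT e.employee_number,
               e.familyid,
               ROW_NUMBER() OVER (PARTITION BY e.employee_number
                                  ORDER BY e.familyid) - 1 AS sequenceid
        FROM employee_members e
    ) a
) b
WHERE b.displacement <> 0
GROUP BY b.employee_number, b.displacement
ORDER BY b.employee_number, missing
"""

all_missing = conn.execute(ALL_MISSING_SQL).fetchall()
print(all_missing)
```

This yields one row per gap, so a gap spanning several consecutive values is reported only by its highest missing value.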
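Answer 1 (score: 3)
Here is one approach. Compute the maximum familyid for each employee, then join that to a list of numbers up to the maximum familyid. The result has one row for each employee and each expected familyid.
Then left outer join this back to the original data on the familyid and the number. Where there is no match, that is a missing value: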
with nums as (
    select 1 as n
    union all
    select n + 1
    from nums
    where n < 20
)
select en.employee, n.n as MissingFamilyId
from (select employee, min(familyid) as minfi, max(familyid) as maxfi
      from t
      group by employee
     ) en join
     nums n
     on n.n <= en.maxfi left outer join
     t
     on t.employee = en.employee and
        t.familyid = n.n
where t.employee is null;
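Note that this will not work when the missing familyid is the last number in the sequence. However, that may be the best that can be done with your data structure.
Also, the query above assumes there are at most 20 family members.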
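For anyone who wants to experiment with the numbers-list approach, here is a rough sketch using Python's sqlite3 module. Several things are assumptions: SQLite needs `WITH RECURSIVE` where SQL Server accepts plain `with`, the table and column names (`employee_members`, `employee_number`) stand in for the placeholder `t` and `employee`, and the sample rows are hypothetical ones built to match the gaps named in the question.

```python
import sqlite3

# Hypothetical sample rows (employee_number, familyid); not the asker's real data.
ROWS = [(1021, f) for f in (0, 1, 3, 4, 6)] + [(1027, f) for f in (0, 2, 3, 4, 5, 7)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee_members (employee_number INTEGER, familyid INTEGER)")
conn.executemany("INSERT INTO employee_members VALUES (?, ?)", ROWS)

# nums holds 1..20; each employee is paired with every number up to that
# employee's largest familyid, and numbers with no matching row are the gaps.
MISSING_SQL = """
WITH RECURSIVE nums(n) AS (
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM nums WHERE n < 20
)
SELECT en.employee_number, nums.n AS missing_familyid
FROM (
    SELECT employee_number, MAX(familyid) AS maxfi
    FROM employee_members
    GROUP BY employee_number
) en
JOIN nums ON nums.n <= en.maxfi
LEFT JOIN employee_members t
       ON t.employee_number = en.employee_number
      AND t.familyid = nums.n
WHERE t.employee_number IS NULL
ORDER BY en.employee_number, nums.n
"""

missing = conn.execute(MISSING_SQL).fetchall()
print(missing)
```

Unlike the one-row-per-gap queries, this lists every value inside a multi-value gap, as long as the largest familyid stays within the 20 generated numbers.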
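Answer 2 (score: 2)
This will work: select all the "Dependent" rows and left join each one to the row just before it. If that row does not exist, show the result: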
SELECT 'Missing Prior', t1.*
FROM employee_members t1
LEFT JOIN employee_members t2 ON t1.employee_number = t2.employee_number
    AND (t1.familyid - 1) = t2.familyid
WHERE t2.employee_number IS NULL AND t1.relation = 'Dependent'
Another version that shows the missing number:
SELECT t1.employee_number, t1.familyid - 1 AS Missing_Member
FROM employee_members t1
LEFT JOIN employee_members t2 ON t1.employee_number = t2.employee_number
    AND (t1.familyid - 1) = t2.familyid
WHERE t2.employee_number IS NULL AND t1.relation = 'Dependent'
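A quick way to see what the self-join returns is to run it on a small table. The sketch below uses Python's sqlite3 module and hypothetical rows: familyid 0 is given the made-up relation label 'Employee' and every other row is 'Dependent', with gaps chosen to match the ones named in the question.

```python
import sqlite3

# Hypothetical data: familyid values present for each employee_number.
SAMPLE = {1021: (0, 1, 3, 4, 6), 1027: (0, 2, 3, 4, 5, 7)}
# 'Employee' for familyid 0 is an assumed label; the rest are dependents.
ROWS = [
    (emp, fid, "Employee" if fid == 0 else "Dependent")
    for emp, fids in SAMPLE.items()
    for fid in fids
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee_members (employee_number INTEGER, familyid INTEGER, relation TEXT)"
)
conn.executemany("INSERT INTO employee_members VALUES (?, ?, ?)", ROWS)

# For every dependent, look for the row whose familyid is one lower;
# when the left join finds nothing, familyid - 1 is a missing value.
PRIOR_MISSING_SQL = """
SELECT t1.employee_number, t1.familyid - 1 AS missing_member
FROM employee_members t1
LEFT JOIN employee_members t2
       ON t1.employee_number = t2.employee_number
      AND t1.familyid - 1 = t2.familyid
WHERE t2.employee_number IS NULL
  AND t1.relation = 'Dependent'
ORDER BY t1.employee_number, missing_member
"""

prior_missing = conn.execute(PRIOR_MISSING_SQL).fetchall()
print(prior_missing)
```

Because each existing row only checks its immediate predecessor, a gap of two or more consecutive values is reported once, by its highest missing value. The relation filter keeps the familyid 0 row from reporting -1 as missing.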
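Answer 3 (score: 1)
Another solution: build a table containing all possible values in the sequence (you can use an identity column for this). Then left join it to the source table and keep the rows where the source table's column is null.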
DECLARE @Seq TABLE (id INT IDENTITY(1, 1))
DECLARE @iter INT = 1
WHILE @iter <= (
    SELECT MAX([your ID column])
    FROM [Offending Table]
)
BEGIN
    INSERT @Seq DEFAULT VALUES
    SET @iter = @iter + 1
END
SELECT s.id
FROM @Seq s
LEFT JOIN [Offending Table] ot ON s.id = ot.[your ID column]
WHERE ot.[your ID column] IS NULL
Answer 4 (score: 0)
This select uses a recursive CTE to retrieve the list of missing "familyid" values for each employee.
QUERY:
WITH emp_grp (
EmployeeID
,MaxFamilyID
)
AS (
SELECT e2.EmployeeID
,MAX(e2.FamilyID) MaxFamilyID
FROM employee_number e2
GROUP BY e2.EmployeeID
)
,emp_mem
AS (
SELECT EmployeeID
,0 AS FamilyID
,MaxFamilyID
FROM emp_grp
UNION ALL
SELECT EmployeeID
,FamilyID + 1 AS FamilyID
,MaxFamilyID
FROM emp_mem
WHERE emp_mem.FamilyID < MaxFamilyID
)
SELECT emp_mem.EmployeeID
,emp_mem.FamilyID
FROM emp_mem
LEFT JOIN employee_number emp_num ON emp_mem.EmployeeID = emp_num.EmployeeID
AND emp_mem.FamilyID = emp_num.FamilyID
WHERE emp_num.EmployeeID IS NULL
ORDER BY emp_mem.EmployeeID
,emp_mem.FamilyID
OPTION ( MAXRECURSION 32767)
Output:
EmployeeID  FamilyID
----------- -----------
1021        2
1021        5
1027        1
1027        6