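Explanation
Imagine that there are 3 companies. We join the tables by Name, because not every employee has provided a PersonalNo. Only experts have a StringId, so it can't be used for joining either. The same employee can work in multiple companies.
Problem
The problem is that different employees can have the same name (first and last name identical; the sample shows first names only).
What do I need?
Return 1 when something is wrong with the data, and 0 if it is correct.
Rules to detect a problem
When the PersonalNo values are equal but not all rows have a StringId (as for Peter), it should return 1 (wrong).
When a PersonalNo is NULL (see John) but both rows have the same StringId, it should return 0 (correct; it means that one of the companies did not provide the PersonalNo).
When the PersonalNo values are equal and all StringId values are equal (see Lisa), it should return 0 (correct).
For Jennifer it should work like this: there are two different Jennifers, one with PersonalNo 4805250141 and one with PersonalNo 4920225088, plus a Jennifer whose PersonalNo is NULL. The Jennifer with the NULL PersonalNo has the same StringId as the Jennifer with PersonalNo 4920225088, so it should return 0 (correct). The Jennifer with PersonalNo 4805250141 should not be selected, because she has no StringId and only 1 row has that PersonalNo. She should not appear in the select at all.
Sample data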
Company  Name      PersonalNo  StringId
Comp1    Peter     3850342515  85426
Comp2    Peter     3850342515  ''        -- same PersonalNo, but one row has no StringId: 1 (wrong)
Comp1    John      NULL        12345
Comp2    John      3952525252  12345     -- same StringId and one PersonalNo is NULL: 0 (correct)
Comp1    Lisa      4951212581  52124
Comp3    Lisa      4951212581  52124     -- PersonalNo equal and StringId equal: 0 (correct)
Comp1    Jennifer  4805250141  ''
Comp1    Jennifer  4920225088  55443     -- 2 different PersonalNo plus a NULL PersonalNo; the NULL row has the same
Comp3    Jennifer  NULL        55443     -- StringId as a row with a PersonalNo, so 0 (correct); the row with a different PersonalNo and no StringId shouldn't appear at all
Comp1    Ralph     3961212256  ''        -- shouldn't appear in the select list: only 1 row with this PersonalNo and no StringId
Desired output
Peter 1
John 0
Lisa 0
Jennifer 0
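To make the rules concrete, here is a minimal Python sketch of one reading of them, run against the sample rows. The names check_names and ROWS are made up for illustration, and treating '' the same as NULL is an assumption. It restates the rules in plain code; it is not the SQL being asked for.

```python
from collections import defaultdict

# Sample rows from the question: (Company, Name, PersonalNo, StringId).
# None stands for NULL and '' for a blank value (both treated as missing).
ROWS = [
    ("Comp1", "Peter", "3850342515", "85426"),
    ("Comp2", "Peter", "3850342515", ""),
    ("Comp1", "John", None, "12345"),
    ("Comp2", "John", "3952525252", "12345"),
    ("Comp1", "Lisa", "4951212581", "52124"),
    ("Comp3", "Lisa", "4951212581", "52124"),
    ("Comp1", "Jennifer", "4805250141", ""),
    ("Comp1", "Jennifer", "4920225088", "55443"),
    ("Comp3", "Jennifer", None, "55443"),
    ("Comp1", "Ralph", "3961212256", ""),
]


def check_names(rows):
    """Return {name: 0 or 1} under one reading of the rules."""
    # Known PersonalNo values per (name, StringId), used to fill gaps.
    known = defaultdict(set)
    for _, name, pno, sid in rows:
        if pno and sid:
            known[(name, sid)].add(pno)

    # Collect StringIds per (name, PersonalNo) after filling a missing PersonalNo.
    groups = defaultdict(list)
    for _, name, pno, sid in rows:
        if not pno and sid and len(known[(name, sid)]) == 1:
            pno = next(iter(known[(name, sid)]))
        groups[(name, pno or None)].append(sid or None)

    result = {}
    for (name, _pno), sids in groups.items():
        if len(sids) == 1 and sids[0] is None:
            continue  # a lone row without StringId is not reported at all
        # Wrong when one PersonalNo carries different (or partly missing) StringIds.
        result[name] = max(result.get(name, 0), int(len(set(sids)) > 1))

    # One StringId shared by different PersonalNo values is also wrong.
    for (name, _sid), pnos in known.items():
        if len(pnos) > 1:
            result[name] = 1
    return result


print(check_names(ROWS))
```

On the sample rows this gives 1 for Peter and 0 for John, Lisa and Jennifer, and Ralph is left out.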
Query
LEFT JOIN (SELECT Name,
               (
               SELECT CASE WHEN MIN(PersonalNo) <> MAX(d.PersonalNo)
                           and MIN(CASE WHEN StringId IS NULL THEN '0' ELSE StringId END) <> MAX(CASE WHEN d.StringId IS NULL THEN '0' ELSE d.StringId END) -- this is wrong
                           and MIN(PersonalNo) <> ''
                           and MIN(PersonalNo) IS NOT NULL
                           and MAX(rn) > 1 THEN 1
                           ELSE 0
                      END AS CheckPersonalNo
               FROM (
                   SELECT Name, PersonalNo, [StringId], ROW_NUMBER() OVER (PARTITION BY Name, PersonalNo ORDER BY Name) rn
                   FROM TableEmp e1
                   WHERE Condition = 1 and e1.Name = d.Name
               ) sub2
               GROUP BY Name
               ) CheckPersonalNo
           FROM [TableEmp] d
           WHERE Condition = 1
           GROUP BY Name
          ) f ON f.Name = x.Name
The problem with the query is that I can group only by Name. I can't add PersonalNo to the GROUP BY clause, so I need to use aggregates in the select list. But now it compares only the MIN and MAX values, so if more than 2 rows have the same name it does not work as expected.
I need to do something like comparing the values within PARTITION BY Fullname, PersonalNo. Right now it compares values that share the same Name (regardless of PersonalNo).
Any ideas? If you have any questions, ask me and I'll try to explain.
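Update 1
If there are 2 entries with different PersonalNo values but equal StringId values, it should go as 1 (wrong).
Company  Name  PersonalNo  StringId
Comp1    Anna  4805250141  88552     -- different PersonalNo and the same StringId for both should go as 1 (wrong)
Comp1    Anna  4920225088  88552
Now it returns:
Anna 0
Anna 0
It should be:
Anna 1
Update 2
After the update with the UNION and the Identifier column, it returns StringId: 55443. But in this case, where 1 entry has a PersonalNo and the other is blank while both have the same (equal) StringId, the data is correct (it should be 0).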
Answer 0 (score: 2)
I hope I have understood your requirements.
There may be other ways to do this, but personally I would probably use temp tables for the interim work.
--select data into a temp table that can be modified
--(table1 is the source table; a bare "table" is a reserved word and would not parse)
select
    *
into #cleaned
from
    table1
--apply personal numbers based on other records with matching string id
--you could take note of the records you are doing this to for data clean up
update c
set c.personalNo = s.personalNo
from #cleaned as c
inner join table1 as s
    on c.name = s.name
    and c.stringID = s.stringID
    and c.personalNo is null
    and s.personalNo is not null
--find all records with non matching string ids
select
name
,PersonalNo
,count(*) as numIDs
into #issues
from(
select
name
,PersonalNo
,stringID
from
#cleaned
group by
name
,PersonalNo
,stringID
) as i
group by
name
,PersonalNo
having
count(*) > 1
--select data for viewing.
select
distinct
s.name
,case
when i.name is not null then 1
else 0
end as issue
from
#cleaned as s
left outer join #issues as i
on s.name = i.name
and s.personalNo = i.personalNo
order by issue desc
SQLFiddle:http://sqlfiddle.com/#!3/f4aab/7
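Sorry if there are bugs in here, but I'm sure you get the idea. It's not rocket science, just another approach.
Edit: I just noticed that you are interested in rows with no string ID; such a row is only not a problem when it is the only row. I modified the first select (into #cleaned) to take all rows.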
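For anyone who wants to try the temp-table steps without a SQL Server instance, here is a rough Python sketch using the standard sqlite3 module. SQLite is only a stand-in: a temp table named cleaned replaces #cleaned, a correlated subquery with MIN replaces the UPDATE ... FROM join (an assumption made for portability), and the rows are the sample data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (Company TEXT, Name TEXT, PersonalNo TEXT, StringId TEXT);
INSERT INTO table1 VALUES
  ('Comp1', 'Peter',    '3850342515', '85426'),
  ('Comp2', 'Peter',    '3850342515', ''),
  ('Comp1', 'John',     NULL,         '12345'),
  ('Comp2', 'John',     '3952525252', '12345'),
  ('Comp1', 'Lisa',     '4951212581', '52124'),
  ('Comp3', 'Lisa',     '4951212581', '52124'),
  ('Comp1', 'Jennifer', '4805250141', ''),
  ('Comp1', 'Jennifer', '4920225088', '55443'),
  ('Comp3', 'Jennifer', NULL,         '55443'),
  ('Comp1', 'Ralph',    '3961212256', '');

-- Step 1: working copy (stands in for SELECT ... INTO #cleaned).
CREATE TEMP TABLE cleaned AS SELECT * FROM table1;

-- Step 2: fill a missing PersonalNo from a row with the same Name and StringId.
UPDATE cleaned
SET PersonalNo = (SELECT MIN(s.PersonalNo)
                  FROM table1 AS s
                  WHERE s.Name = cleaned.Name
                    AND s.StringId = cleaned.StringId
                    AND s.PersonalNo IS NOT NULL)
WHERE PersonalNo IS NULL;
""")

# Step 3: flag (Name, PersonalNo) pairs that carry more than one StringId.
rows = conn.execute("""
SELECT DISTINCT s.Name,
       CASE WHEN i.Name IS NOT NULL THEN 1 ELSE 0 END AS issue
FROM cleaned AS s
LEFT JOIN (
    SELECT Name, PersonalNo
    FROM (SELECT Name, PersonalNo, StringId
          FROM cleaned
          GROUP BY Name, PersonalNo, StringId)
    GROUP BY Name, PersonalNo
    HAVING COUNT(*) > 1
) AS i ON s.Name = i.Name AND s.PersonalNo = i.PersonalNo
ORDER BY issue DESC, s.Name
""").fetchall()

for name, issue in rows:
    print(name, issue)
```

On the sample data this reports Peter as 1 and everyone else as 0. Ralph is listed too, so rows that should not appear at all would still need to be filtered out separately.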
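Edit: without temp tables
Now that you know what it is doing, here is the same thing without any temp tables. Be warned, though: this updates the source table, assigning the missing personalNo values.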
update c
set c.personalNo = s.personalNo
from table1 as c
inner join table1 as s
on c.name = s.name
and c.stringID = s.stringID
and c.personalNo is null
and s.personalNo is not null
select
distinct
s.name
,case
when i.name is not null then 1
else 0
end as issue
from
table1 as s
left outer join (
select
name
,PersonalNo
,count(*) as numIDs
from(
select
name
,PersonalNo
,stringID
from
table1
group by
name
,PersonalNo
,stringID
) as i
group by
name
,PersonalNo
having
count(*) > 1
)
as i
on s.name = i.name
and s.personalNo = i.personalNo
order by issue desc
SQLFiddle:http://sqlfiddle.com/#!3/f4aab/8
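PARTITIONING
I'm not sure how I would use partitioning here, since all you want is to know whether there is more than one row. I use partitioning for more complex tabulation, or if I were going to rank results in order to make a judgment call on updating data based on more complex rules. But anyway, here is a version with partitioning crowbarred in :D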
Select
name
,personalNo
,case
when numstrings > 1 then 1
else 0 end as issue
from
(select
name
,personalNo
,row_number() over (partition by
name
,personalNo
order by
name
,personalNo
,stringID
) as numstrings
from
#cleaned
group by
name
,personalNo
,stringid) as d
order by
issue desc
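Note: this uses the #cleaned table as above, because I think the key thing that makes this situation difficult is that the personalNo is sometimes missing.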
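One thing to keep in mind with ROW_NUMBER is that it labels each row with its position inside the partition, so the first row of a conflicting group still gets 0. If a flag for the whole group is wanted, a windowed MIN/MAX over the same partition may be closer to what the question describes. The sketch below is an assumption-laden illustration in Python's sqlite3 (window functions need SQLite 3.25 or newer); the cleaned table and its four rows are a cut-down stand-in for #cleaned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cleaned (Name TEXT, PersonalNo TEXT, StringId TEXT);
INSERT INTO cleaned VALUES
  ('Peter', '3850342515', '85426'),
  ('Peter', '3850342515', ''),
  ('Lisa',  '4951212581', '52124'),
  ('Lisa',  '4951212581', '52124');
""")

# Compare the smallest and largest StringId inside each (Name, PersonalNo)
# partition; COALESCE makes a NULL StringId count as a blank value.
flags = conn.execute("""
SELECT DISTINCT Name, PersonalNo,
       CASE WHEN MIN(COALESCE(StringId, '')) OVER (PARTITION BY Name, PersonalNo)
              <> MAX(COALESCE(StringId, '')) OVER (PARTITION BY Name, PersonalNo)
            THEN 1 ELSE 0 END AS issue
FROM cleaned
ORDER BY Name
""").fetchall()

for name, personal_no, issue in flags:
    print(name, personal_no, issue)
```

Every row of a partition gets the same flag here, so the DISTINCT collapses each group to a single line: Lisa with 0 and Peter with 1.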
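No temp tables, no updates
Building on the above, this can obviously be done without any temp tables and without updating anything. It is just a question of readability/maintainability, and of whether it is actually any faster. This should cope more robustly with string IDs that have multiple personalNo values assigned: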
select
distinct
s.name
,case
when i.name is not null then 1
else 0
end as issue
from
table1 as s
left outer join (
select
name
,PersonalNo
,count(*) as numIDs
from(
select
a.name
,coalesce(a.PersonalNo,b.PersonalNo) as PersonalNo
,a.stringID
from
table1 as a
left outer join table1 as b
on a.name = b.name
and a.stringid=b.stringid
and a.personalNo != b.personalNo
and b.personalNo Is Not Null
group by
a.name
,a.PersonalNo
,a.stringID
,b.PersonalNo
) as i
group by
name
,PersonalNo
having
count(*) > 1
)
as i
on s.name = i.name
and s.personalNo = i.personalNo
order by issue desc
SQLFiddle:http://sqlfiddle.com/#!3/f4aab/9
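Edit: looking for inconsistent personal numbers
This uses a temp table, but you can swap it out as was done in the previous example. Note that it differs slightly from the original structure you proposed, because this is more how I would go about the task, but there is enough code here for you to rework it any way you want.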
--select data into a temp table that can be modified
select
*
into #cleaned
from
table1
--apply personal numbers based on other records with matching string id
--you could take note of the records you are doing this to for data clean up
update c
set c.personalNo = s.personalNo
from #cleaned as c
inner join table1 as s
on c.name = s.name
and c.stringID = s.stringID
and c.personalNo is null
and s.personalNo is not null
Select
IssueType
,Name
,Identifier
from
(
--find all records with non matching PersonalNos
select
name
,cast('StringID: ' + stringID as nvarchar(400)) as Identifier
,cast('Inconsistent PersonalNo' as nvarchar(400)) as issueType
from(
select
name
,PersonalNo
,stringID
from
#cleaned
group by
name
,PersonalNo
,stringID
) as i
group by
name
,StringId
having
count(*) > 1
UNION
--find all records with non matching string ids
select
name
,'PersonalNo: ' + PersonalNo
,cast('Inconsistent String ID' as nvarchar(400)) as issueType
from(
select
name
,PersonalNo
,stringID
from
#cleaned
group by
name
,PersonalNo
,stringID
) as i
group by
name
,PersonalNo
having
count(*) > 1
) as a
SQLFiddle:http://sqlfiddle.com/#!3/e9da2/18
Update: also accept empty-string personalNo
This is another new requirement: accept empty strings in personalNo the same way as NULL.
--select data into a temp table that can be modified
select
*
into #cleaned
from
table1
--apply personal numbers based on other records with matching string id
--you could take note of the records you are doing this to for data clean up
update c
set c.personalNo = s.personalNo
from #cleaned as c
inner join table1 as s
on c.name = s.name
and c.stringID = s.stringID
and (c.personalNo IS NULL OR c.personalNo ='')
and s.personalNo is not null
and s.personalNo != ''
Select
IssueType
,Name
,Identifier
from
(
--find all records with non matching PersonalNos
select
name
,cast('StringID: ' + stringID as nvarchar(400)) as Identifier
,cast('Inconsistent PersonalNo' as nvarchar(400)) as issueType
from(
select
name
,PersonalNo
,stringID
from
#cleaned
group by
name
,PersonalNo
,stringID
) as i
group by
name
,StringId
having
count(*) > 1
UNION
--find all records with non matching string ids
select
name
,'PersonalNo: ' + PersonalNo
,cast('Inconsistent String ID' as nvarchar(400)) as issueType
from(
select
name
,PersonalNo
,stringID
from
#cleaned
group by
name
,PersonalNo
,stringID
) as i
group by
name
,PersonalNo
having
count(*) > 1
) as a
SQLFiddle:http://sqlfiddle.com/#!3/412127/8
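As a possible simplification of the empty-string handling, the blanks could be turned into NULL once, right after the copy, so that later conditions only need IS NULL. This is just a sketch under that assumption, again using Python's sqlite3 as a stand-in with a made-up cleaned table. NULLIF itself is standard SQL and exists in SQL Server as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cleaned (Name TEXT, PersonalNo TEXT, StringId TEXT);
INSERT INTO cleaned VALUES
  ('Jennifer', '4920225088', '55443'),
  ('Jennifer', '',           '55443'),
  ('John',     NULL,         '12345'),
  ('John',     '3952525252', '12345');

-- NULLIF(x, '') returns NULL when x is the empty string and x otherwise.
UPDATE cleaned SET PersonalNo = NULLIF(PersonalNo, '');
""")

missing = conn.execute(
    "SELECT COUNT(*) FROM cleaned WHERE PersonalNo IS NULL"
).fetchone()[0]
blank = conn.execute(
    "SELECT COUNT(*) FROM cleaned WHERE PersonalNo = ''"
).fetchone()[0]
print(missing, blank)
```

After the normalisation both the blank row and the NULL row are found by the single IS NULL test, and no empty strings remain.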