Select objects associated with similar data sets

Time: 2016-05-19 21:07:40

Tags: sql sql-server tsql

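I am trying to select every row from the [Company] table for a company that has the same number of employees as at least one other company (employees come from the [Employee] table, which has a CompanyId column), where the two groups of employees also share the same set of LocationIds (another column in the [Employee] table) in the same proportions.

So, for example, this query would select two companies that each have three employees with LocationIds 1, 2, and 2.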

[Employee]

 EmployeeId | CompanyId | LocationId
=====================================
 1          | 1         | 1
 2          | 1         | 2
 3          | 1         | 2
 4          | 2         | 1
 5          | 2         | 2
 6          | 2         | 2
 7          | 3         | 3



[Company]

 CompanyId
===========
 1
 2
 3


  Returns the CompanyIds:
  ======================
  1
  2

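CompanyIds 1 and 2 are selected because each has the following in common with at least one other company: 1. the number of employees (3 employees); 2. the number/proportion of LocationIds associated with those employees (1 employee with LocationId 1 and 2 employees with LocationId 2).

So far, I think I want to use a HAVING COUNT(?) > 1 clause, but I am having trouble working out the details. Does anyone have any suggestions?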

2 answers:

Answer 0: (score: 1)

This is ugly, but it is the only way I can think of to do it:

;with CTE as (
    select c.Id,
        (
            select e.Location, count(e.Id) [EmployeeCount]
            from Employee e
            where e.IdCompany=c.Id
            group by e.Location
            order by e.Location
            for xml auto
        ) LocationEmployeeData
    from Company c
)
select c.Id
from Company c
join (
    select x.LocationEmployeeData, count(x.Id) [CompanyCount]
    from CTE x
    group by x.LocationEmployeeData
    having count(x.Id) >= 2
) y on y.LocationEmployeeData = (select LocationEmployeeData from CTE where Id = c.Id)

See the fiddle: http://www.sqlfiddle.com/#!6/6bc16/5

It works by encoding each company's per-location employee counts (multiple rows) as a single XML string.

The CTE code:

select c.Id,
    (
        select e.Location, count(e.Id) [EmployeeCount]
        from Employee e
        where e.IdCompany=c.Id
        group by e.Location
        order by e.Location
        for xml auto
    ) LocationEmployeeData
from Company c

produces data like this:

Id  LocationEmployeeData
1   <e Location="1" EmployeeCount="2"/><e Location="2" EmployeeCount="1"/>
2   <e Location="1" EmployeeCount="2"/><e Location="2" EmployeeCount="1"/>
3   <e Location="3" EmployeeCount="1"/>

The outer query then compares companies on this string (rather than trying to work out whether multiple rows match, and so on).
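The encoding idea can be tried outside SQL as well. The following is only a minimal Python sketch of it, not the answer's code: it assumes the sample rows from the question, and the function names `company_signatures` and `companies_with_shared_signature` are made up for illustration. A sorted tuple of (LocationId, employee count) pairs stands in for the XML string.

```python
from collections import Counter, defaultdict

# Sample rows from the question: (EmployeeId, CompanyId, LocationId).
employees = [
    (1, 1, 1), (2, 1, 2), (3, 1, 2),
    (4, 2, 1), (5, 2, 2), (6, 2, 2),
    (7, 3, 3),
]


def company_signatures(rows):
    """Map each CompanyId to a sorted tuple of (LocationId, employee count)."""
    counts = defaultdict(Counter)
    for _employee_id, company_id, location_id in rows:
        counts[company_id][location_id] += 1
    # Sorting plays the role of ORDER BY in the subquery: two companies with
    # the same distribution must serialize to exactly the same value.
    return {cid: tuple(sorted(c.items())) for cid, c in counts.items()}


def companies_with_shared_signature(rows):
    """Return the CompanyIds whose signature is shared with another company."""
    by_signature = defaultdict(list)
    for cid, signature in company_signatures(rows).items():
        by_signature[signature].append(cid)
    # Counterpart of HAVING COUNT(...) >= 2 on the grouped signatures.
    return sorted(
        cid
        for group in by_signature.values()
        if len(group) >= 2
        for cid in group
    )


print(companies_with_shared_signature(employees))  # [1, 2]
```

Because the signature holds absolute counts per location, two companies match only when they also have the same total number of employees, which is what the question asks for.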

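Answer 1: (score: 1)

An alternative solution might look like the one below. However, it needs performance testing first (I am not very confident about joins on a <> condition).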

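with List as
(
  select
    IdCompany,
    Location,
    -- position of each employee's location in the company's sorted list
    row_number() over (partition by IdCompany order by Location) as RowId,
    -- total number of employees in the company
    count(1) over (partition by IdCompany) as LocCount
  from
    Employee
)
select distinct
  A.IdCompany
from List as A
  inner join List as B on A.IdCompany <> B.IdCompany
  and A.RowId = B.RowId
  and A.LocCount = B.LocCount
group by
  -- one group per pair of companies, so that a partial match with a
  -- third company cannot inflate the sum compared in HAVING
  A.IdCompany, B.IdCompany, A.LocCount
having
  sum(case when A.Location = B.Location then 1 else 0 end) = A.LocCount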

Related fiddle: http://sqlfiddle.com/#!6/d9f2e/1
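To experiment with the self-join idea without a SQL Server instance, it can be run against an in-memory SQLite database. This is a rough sketch under several assumptions: it uses the question's column names (CompanyId, LocationId) rather than the fiddle's, it needs an SQLite build with window functions (3.25 or later), and the helper name `matching_companies` is hypothetical. It groups by each pair of companies so that a partial match against a third company cannot distort the sum.

```python
import sqlite3

# Pair up each company's sorted locations by position, then keep companies
# for which every position matches some other company of the same size.
QUERY = """
WITH List AS (
  SELECT
    CompanyId,
    LocationId,
    ROW_NUMBER() OVER (PARTITION BY CompanyId ORDER BY LocationId) AS RowNum,
    COUNT(*) OVER (PARTITION BY CompanyId) AS LocCount
  FROM Employee
)
SELECT DISTINCT A.CompanyId
FROM List AS A
JOIN List AS B
  ON A.CompanyId <> B.CompanyId
 AND A.RowNum = B.RowNum
 AND A.LocCount = B.LocCount
GROUP BY A.CompanyId, B.CompanyId, A.LocCount
HAVING SUM(CASE WHEN A.LocationId = B.LocationId THEN 1 ELSE 0 END) = A.LocCount
ORDER BY A.CompanyId
"""


def matching_companies(rows):
    """Load (EmployeeId, CompanyId, LocationId) rows and run QUERY on them."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE Employee ("
            "EmployeeId INTEGER PRIMARY KEY, CompanyId INTEGER, LocationId INTEGER)"
        )
        conn.executemany("INSERT INTO Employee VALUES (?, ?, ?)", rows)
        return [row[0] for row in conn.execute(QUERY)]
    finally:
        conn.close()


# Sample data from the question.
sample = [
    (1, 1, 1), (2, 1, 2), (3, 1, 2),
    (4, 2, 1), (5, 2, 2), (6, 2, 2),
    (7, 3, 3),
]
print(matching_companies(sample))  # [1, 2]
```

Adding a hypothetical fourth company with locations 1, 3, 3 leaves the result unchanged: it matches companies 1 and 2 only at the first position, so no pair involving it reaches a full count of 3.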