Select only matching records from two tables

Date: 2011-08-19 00:00:38

Tags: sql oracle

I am using Oracle 10g. Here is my scenario:

I have two tables:

class1(groupName, subgroup)
class2(groupName, subgroup, ind)

Here is my data:

class1
groupName  subgroup
     A      1
     A      2
     B      3
     C      4
     C      4
     C      5
     D      6


class2
groupName  subgroup IND
     A      1        Y     
     A      1        N
     A      2        Y
     A      2        N
     B      3        Y
     C      4        Y
     C      4        N

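Now I need to fetch the data in class1 and class2 that have matching groupName and subgroup values (the matches need not be distinct). In addition, the IND column in class2 must contain a 'Y' and 'N' pair for every subgroup. In the example above, groupName A qualifies: A exists in both class1 and class2, its subgroups 1 and 2 exist in both tables, and the IND column in class2 has a 'Y' and 'N' pair for each of those subgroups (1 and 2).

The other records do not qualify. Group B's subgroup 3 exists in both class1 and class2, but class2 has no 'Y' and 'N' pair for subgroup 3. Groups C and D do not qualify because not all of their subgroups exist in class2 (for example, C has subgroups 4 and 5 in class1, but only 4 appears in class2).

I have more than 700,000 records in both class1 and class2. Does anyone know an efficient way to get this information?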
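The qualifying rule described above can be checked against the sample data with a small, self-contained sketch. It uses Python's built-in sqlite3 module rather than Oracle, purely so that it runs anywhere; the query shape (a LEFT JOIN onto the subgroups that have both 'Y' and 'N') is one possible formulation and an assumption of this sketch, not something taken from the answers below. The SQL itself sticks to standard constructs, but it has not been tested on Oracle 10g.

```python
import sqlite3

# In-memory database holding the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE class1 (groupName TEXT, subgroup INTEGER);
CREATE TABLE class2 (groupName TEXT, subgroup INTEGER, ind TEXT);
""")
conn.executemany(
    "INSERT INTO class1 VALUES (?, ?)",
    [("A", 1), ("A", 2), ("B", 3), ("C", 4), ("C", 4), ("C", 5), ("D", 6)],
)
conn.executemany(
    "INSERT INTO class2 VALUES (?, ?, ?)",
    [("A", 1, "Y"), ("A", 1, "N"), ("A", 2, "Y"), ("A", 2, "N"),
     ("B", 3, "Y"), ("C", 4, "Y"), ("C", 4, "N")],
)

# A group qualifies when every distinct subgroup it has in class1
# also appears in class2 with both a 'Y' and an 'N' row.
QUERY = """
SELECT c1.groupName
FROM (SELECT DISTINCT groupName, subgroup FROM class1) c1
LEFT JOIN (
    SELECT groupName, subgroup
    FROM class2
    GROUP BY groupName, subgroup
    HAVING COUNT(DISTINCT CASE WHEN ind IN ('Y', 'N') THEN ind END) = 2
) ok
  ON ok.groupName = c1.groupName AND ok.subgroup = c1.subgroup
GROUP BY c1.groupName
HAVING COUNT(*) = COUNT(ok.subgroup)  -- no subgroup left unmatched
ORDER BY c1.groupName
"""

qualifying = [row[0] for row in conn.execute(QUERY)]
print(qualifying)  # ['A']
```

On the sample data only group A survives: B fails because subgroup 3 has no 'N' row, C fails because subgroup 5 is missing from class2, and D fails because subgroup 6 is missing.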

3 Answers:

Answer 0: (score: 2)

Does this produce what you need?

SELECT *
FROM class1 c1
JOIN class2 c2 ON c1.groupName = c2.groupName
        AND c1.subgroup = c2.subgroup
WHERE
    (
    SELECT COUNT(DISTINCT ind)
    FROM class2 c2a
    WHERE c2a.groupName = c1.groupName
        AND c2a.subgroup = c1.subgroup
    ) = 2
  AND
    (
    SELECT COUNT(DISTINCT subgroup)
    FROM class1 c1b
    WHERE c1b.groupName = c1.groupName
    ) =
    (
    SELECT COUNT(DISTINCT subgroup)
    FROM class2 c2b
    WHERE c2b.groupName = c2.groupName
    )

Answer 1: (score: 0)

If it does not matter that the IND values are specifically Y and N, you can do this:

select t1.groupName from
( select count(class1.groupName) a, groupName from class1 group By groupName) t1
 inner join 
( select count(class2.groupName) a, groupName from class2 group by groupName) t2
on t1.groupName = t2.groupName and 2*t1.a = t2.a

If it does matter, you can modify the second inner query like this:

select count(class2.groupName) a, groupName from class2 group by groupName
  having  max(ind) = 'Y' and min(ind) = 'N'
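To see what the MAX/MIN filter keeps, here is a minimal sketch run against the class2 sample data from the question. It uses Python's built-in sqlite3 module instead of Oracle so that it is self-contained, and it groups by both groupName and subgroup, which is an assumption of this sketch (the snippet above groups by groupName only). The trick also assumes IND only ever holds 'Y' or 'N'; any other value could change the minimum or maximum.

```python
import sqlite3

# In-memory copy of the class2 sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE class2 (groupName TEXT, subgroup INTEGER, ind TEXT)")
conn.executemany(
    "INSERT INTO class2 VALUES (?, ?, ?)",
    [("A", 1, "Y"), ("A", 1, "N"), ("A", 2, "Y"), ("A", 2, "N"),
     ("B", 3, "Y"), ("C", 4, "Y"), ("C", 4, "N")],
)

# 'N' sorts before 'Y', so MIN = 'N' and MAX = 'Y' together mean
# that both values occur within the group.
pairs_with_y_and_n = conn.execute("""
    SELECT groupName, subgroup
    FROM class2
    GROUP BY groupName, subgroup
    HAVING MAX(ind) = 'Y' AND MIN(ind) = 'N'
    ORDER BY groupName, subgroup
""").fetchall()

print(pairs_with_y_and_n)  # [('A', 1), ('A', 2), ('C', 4)]
```

Subgroup 3 of group B is dropped because its only IND value is 'Y', so its minimum is 'Y' rather than 'N'.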

Edited to support the additional test mentioned in the comments:

select distinct t1.groupName from
( select count(class1.groupName) a, groupName, subgroup from class1 
  group By groupName, subgroup) t1
 inner join 
( select count(class2.groupName) a, groupName, subgroup from class2 
  group by groupName, subgroup
  having  max(ind) = 'Y' and min(ind) = 'N') t2
on t1.groupName = t2.groupName and t1.subgroup = t2.subgroup and 2*t1.a = t2.a

Answer 2: (score: 0)

Something like this should help...

select 
  groupName
from (

  select -- Get number of good subgroups for each group
    groupName                      as groupName,
    subGroupCount                  as subGroupCount,
    sum( decode(ynCount, 2,1, 0) ) as goodGroupCount
  from (

      select -- Find which subgroups are good (contains both Y and N)
        c1set.groupName        as groupName,
        c1set.subGroup         as subGroup,
        c1set.subGroupCount    as subGroupCount,
        count(distinct c2.IND) as ynCount
      from
        (
          select -- Collect group/subgroup sets and get number of subGroups
            distinct 
              c1.groupName                        as groupName,
              c1.subGroup                         as subGroup,
              count(distinct c1.subGroup)
                over (partition by c1.groupName) as subGroupCount
          from
            class1 c1
        )
               c1set,
        class2 c2
      where
        c2.groupName (+) = c1set.groupName
        and
        c2.subGroup (+) = c1set.subGroup
      group by
        c1set.groupName,
        c1set.subGroup,
        c1set.subGroupCount

    )
  group by
    groupName, subGroupCount

)
where
  subGroupCount = goodGroupCount

Sorry, I can't test this code myself right now. If you find any inconsistencies, please leave a comment.