How do I find attributes that are inconsistent within a specific group?

Asked: 2016-11-04 13:03:33

Tags: sql database oracle

I want to identify every instance where an attribute's values are not equal within a group. For example:

+------------+---------+------------+
|group_id    |  area   |     pip    |
+------------+---------+------------+
|  432       |  23     |  jack      |
|  432       |  23     |  jack      |
|  745       |  45     |  bill      |
|  745       |  45     |  bill      |
|  848       |  67     |  lynn      |
|  848       |  65     |  lynn      |
|  23        |  33     |  hop       |
|  23        |  33     |  hope      |
|  670       |  893    |  sue       |
|  670       |  893    |  sue       |
+------------+---------+------------+

What I need is every group_id in which any attribute differs between rows. So the output table should be:

+------------+---------+------------+
|group_id    |  area   |     pip    |
+------------+---------+------------+
|  848       |  67     |  lynn      |
|  848       |  65     |  lynn      |
|  23        |  33     |  hop       |
|  23        |  33     |  hope      |
+------------+---------+------------+

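This is the output because the area differs within group_id 848 and pip is misspelled within group 23. I also want to return all fields (I have more than the 3 shown here). Thanks.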

5 answers:

Answer 0: (score: 1)

In this solution, the only thing you need to do to support additional attributes is add them to this part:
order by area,pip,...

select      *

from       (select      t.*
                       ,min (rnk) over (partition by group_id)  as min_rnk
                       ,max (rnk) over (partition by group_id)  as max_rnk

            from       (select      t.*
                                   ,rank () over (partition by group_id order by area,pip)  as rnk

                        from        t
                        ) t
            ) t

where       min_rnk <> max_rnk
;

select      *

from       (select      t.*
                       ,count (distinct rnk) over (partition by group_id)  as distinct_rnk

            from       (select      t.*
                                   ,rank () over (partition by group_id order by area,pip)  as rnk

                        from        t
                        ) t
            ) t

where       distinct_rnk > 1
;

The count (distinct rnk) version is more natural to read, but it may carry a significant performance penalty.
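To see why comparing the smallest and largest rank per group works, here is a minimal sketch that runs the inner rank() query on the question's sample data. It is only an illustration: it uses Python's built-in sqlite3 module in place of Oracle and assumes SQLite 3.25 or later (needed for window functions). The table name t comes from the queries above; the Python variable names are made up for this example.

```python
import sqlite3

# Sample rows from the question; SQLite stands in for Oracle here (assumption).
rows = [
    (432, 23, "jack"), (432, 23, "jack"),
    (745, 45, "bill"), (745, 45, "bill"),
    (848, 67, "lynn"), (848, 65, "lynn"),
    (23, 33, "hop"), (23, 33, "hope"),
    (670, 893, "sue"), (670, 893, "sue"),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table t (group_id integer, area integer, pip text)")
conn.executemany("insert into t values (?, ?, ?)", rows)

# Same inner query as in the answer: identical rows tie and share a rank.
ranked = conn.execute(
    """
    select group_id,
           rank() over (partition by group_id order by area, pip) as rnk
    from t
    order by group_id, rnk
    """
).fetchall()

# Collect the ranks seen in each group.
ranks_by_group = {}
for group_id, rnk in ranked:
    ranks_by_group.setdefault(group_id, []).append(rnk)

# A group whose rows are all identical has min rank == max rank.
mixed_groups = sorted(g for g, r in ranks_by_group.items() if min(r) != max(r))

for group_id in sorted(ranks_by_group):
    print(group_id, ranks_by_group[group_id])
print("groups with differing rows:", mixed_groups)
```

Every row of a fully consistent group ties at rank 1, so only groups with at least one differing row end up with two different rank values, which is what the min_rnk <> max_rnk filter tests.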

Answer 1: (score: 1)

Use window functions to get the aggregate information you need:

select group_id, area, pip, colx, coly, colz, ...
from
(
  select 
    mytable.*,
    count(distinct area) over (partition by group_id) as count_area,
    count(distinct pip) over (partition by group_id) as count_pip
  from mytable
)
where count_area > 1 or count_pip > 1;

Answer 2: (score: 0)

First I would get the groups that have more than one distinct pip or area:

-- union (not union all) so a group that differs in both columns is listed only once
select group_id from mytable group by group_id having count(distinct pip) > 1 union
select group_id from mytable group by group_id having count(distinct area) > 1;

Then join that with the source table:

select t1.* from mytable t1 
            inner join (select group_id from mytable group by group_id having count(distinct pip) > 1 union
                        select group_id from mytable group by group_id having count(distinct area) > 1) t2 
            on (t1.group_id = t2.group_id);
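The group-then-join idea can be tried on the question's sample data. This is a rough sketch rather than the answer's exact query: it runs on SQLite through Python's sqlite3 module instead of Oracle, uses a hypothetical table name t, and folds the two HAVING conditions into one subquery with OR, which is a variation of my own.

```python
import sqlite3

# Sample rows from the question; SQLite stands in for Oracle here (assumption).
rows = [
    (432, 23, "jack"), (432, 23, "jack"),
    (745, 45, "bill"), (745, 45, "bill"),
    (848, 67, "lynn"), (848, 65, "lynn"),
    (23, 33, "hop"), (23, 33, "hope"),
    (670, 893, "sue"), (670, 893, "sue"),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table t (group_id integer, area integer, pip text)")
conn.executemany("insert into t values (?, ?, ?)", rows)

# Step 1 (subquery): groups with more than one distinct area or pip.
# Step 2 (join): pull every original row belonging to those groups.
query = """
    select t1.group_id, t1.area, t1.pip
    from t t1
    join (
        select group_id
        from t
        group by group_id
        having count(distinct area) > 1 or count(distinct pip) > 1
    ) t2 on t1.group_id = t2.group_id
    order by t1.group_id, t1.area, t1.pip
"""
result = conn.execute(query).fetchall()

for row in result:
    print(row)
```

Because the subquery returns each offending group_id once, the join cannot duplicate source rows, even for a group that differs in both columns.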

Answer 3: (score: 0)

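Here is one way. Because you need a "count distinct" over multiple columns, which is not an option in Oracle SQL, you need a trick. Here I concatenate the two columns with a ~ between them; the separator should be a character that cannot be the last character of "area", for example.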

with
     test_data ( group_id, area, pip ) as (
       select 432,  23, 'jack' from dual union all   
       select 432,  23, 'jack' from dual union all      
       select 745,  45, 'bill' from dual union all
       select 745,  45, 'bill' from dual union all
       select 848,  67, 'lynn' from dual union all
       select 848,  65, 'lynn' from dual union all
       select  23,  33, 'hop'  from dual union all
       select  23,  33, 'hope' from dual union all
       select 670, 893, 'sue'  from dual union all
       select 670, 893, 'sue'  from dual
     )
-- end of test data; solution (SQL query) begins below this line
select group_id, area, pip
from   (  select group_id, area, pip,
                 count(distinct to_char(area) || '~' || pip) 
                       over (partition by group_id) as ct
          from   test_data
       )
where  ct > 1
;

GROUP_ID  AREA  PIP
--------  ----  ----
      23    33  hop
      23    33  hope
     848    65  lynn
     848    67  lynn
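The reason for the separator is easiest to see with values that collide when glued together directly. The sketch below is an assumption-laden illustration: the two rows are invented (they are not in the question's data), and it uses SQLite via Python's sqlite3 module with cast(area as text) standing in for Oracle's to_char(area).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (group_id integer, area integer, pip text)")

# Hypothetical rows: different (area, pip) pairs whose plain concatenation is equal.
conn.executemany(
    "insert into t values (?, ?, ?)",
    [(1, 1, "23x"), (1, 12, "3x")],
)

# Without a separator both rows collapse to the string '123x'.
without_sep = conn.execute(
    "select count(distinct cast(area as text) || pip) from t where group_id = 1"
).fetchone()[0]

# With '~' in between, the strings are '1~23x' and '12~3x'.
with_sep = conn.execute(
    "select count(distinct cast(area as text) || '~' || pip) from t where group_id = 1"
).fetchone()[0]

print("distinct keys without separator:", without_sep)
print("distinct keys with separator:", with_sep)
```

Since a number converted to text never contains ~, the first ~ in the combined string always marks where area ends, so two different pairs cannot produce the same key.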

Answer 4: (score: 0)

This solution is close to all the others, but I hope it is easier to understand.

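Step 1: Group together all rows that are exactly identical

Step 2: From the result of step 1, find every Group_id that still has more than 1 row

Final step: Join all data rows with those unequal groups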

WITH 
MyData AS (
  SELECT 432 group_id, 23 area,'jack' pip from dual union all
  SELECT 432, 23,'jack' from dual union all
  SELECT 745, 45,'bill' from dual union all
  SELECT 745, 45,'bill' from dual union all
  SELECT 848, 67,'lynn' from dual union all
  SELECT 848, 65,'lynn' from dual union all
  SELECT 23, 33,'hop' from dual union all
  SELECT 23, 33,'hope' from dual union all
  SELECT 670, 893,'sue' from dual union all
  SELECT 670, 893,'sue' from dual union all
  SELECT 999, 123,'andreas' from dual 
), 
-- End of Data
MyEqualGroups AS (
  SELECT GROUP_ID, AREA, PIP FROM MyData GROUP BY GROUP_ID, AREA, PIP 
),
MyUnequalGroups AS (
  SELECT GROUP_ID, COUNT(*) FROM MyEqualGroups GROUP BY GROUP_ID HAVING COUNT(*) > 1
)
SELECT D.*  -- only the original columns, without the helper COUNT(*) column
FROM MyData D
JOIN MyUnequalGroups UG ON D.GROUP_ID = UG.GROUP_ID
;