Applying DISTINCT on 2 fields and extracting unique data for each column

Asked: 2013-11-26 09:53:32

Tags: sql oracle

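Because of a somewhat odd requirement, I need to select records such that every output value is unique within each of the two columns.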

The input looks like this:

col1   col2
1       x
1       y
2       x
2       y
3       x
3       y
3       z

The expected output is either of the following:

col1  col2
1     x
2     y
3     z

col1  col2
1     y
2     x
3     z

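I tried applying DISTINCT on the 2 fields, but that returns all the records, because every row is distinct when both fields are considered together. What we want is: if any value is present in col1, then it cannot be repeated in col2.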

Please let me know whether this is possible and, if so, how to do it.

5 answers:

Answer 0: (score: 1)

You can use a full outer join to merge the two numbered lists together:

SELECT  col1, col2
FROM  ( SELECT col1, ROW_NUMBER() OVER ( ORDER BY col1 ) col1_num
        FROM   your_table
        GROUP BY col1 )
  FULL JOIN 
      ( SELECT col2, ROW_NUMBER() OVER ( ORDER BY col2 ) col2_num
        FROM   your_table
        GROUP BY col2 )
  ON  col1_num = col2_num

If you need a different ordering, change the ORDER BY; if you are happy to let Oracle decide, use ORDER BY NULL.
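To make the pairing logic of that query easier to follow, here is a minimal Python sketch of the same idea, not the Oracle query itself. The function name `pair_by_rank` and the in-memory `rows` list are assumptions made for illustration: each column's distinct values are sorted and numbered, and values with the same position are paired, with `zip_longest` standing in for the FULL JOIN.

```python
from itertools import zip_longest


def pair_by_rank(rows):
    """Pair the k-th distinct col1 value with the k-th distinct col2 value."""
    col1_values = sorted({c1 for c1, _ in rows})
    col2_values = sorted({c2 for _, c2 in rows})
    # Like the FULL JOIN, zip_longest pads the shorter list (with None).
    return list(zip_longest(col1_values, col2_values))


rows = [(1, "x"), (1, "y"), (2, "x"), (2, "y"), (3, "x"), (3, "y"), (3, "z")]
print(pair_by_rank(rows))  # [(1, 'x'), (2, 'y'), (3, 'z')]
```

Because values are paired purely by position, a returned pair is not guaranteed to exist as a row in the source data: for the rows (1, y) and (2, x), the sketch returns (1, x) and (2, y).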

Answer 1: (score: 1)

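Great question! Armunin picked up on the deeper structural issue here: this is a recursively enumerable problem description that can only be solved with a recursive solution; the basic relational operators (join/union/etc.) will not get you there. As Armunin mentioned, one approach is to bring in PL/SQL; I have not checked it in detail, but I assume the PL/SQL code works. However, Oracle kindly supports recursive SQL, which lets us build the solution in SQL alone: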

Note: this SQL generates every solution; if you only need one, filter on SOLUTION_NUMBER = 1 at the end.

with t as (
select 1 col1, 'x' col2 from dual union all
select 1 col1, 'y' col2 from dual union all
select 2 col1, 'x' col2 from dual union all
select 2 col1, 'y' col2 from dual union all
select 3 col1, 'x' col2 from dual union all
select 3 col1, 'y' col2 from dual union all
select 3 col1, 'z' col2 from dual
), 
t0 as 
    (select t.*, 
            row_number() over (order by col1) id, 
            dense_rank() over (order by col2) c2_rnk 
     from t),
-- recursive step...
t1 (c2_rnk,ids, str) as
    (-- base row
     select c2_rnk, '('||id||')' ids, '('||col1||')' str 
     from   t0 
     where  c2_rnk=1
     union all
     -- induction
     select t0.c2_rnk, ids||'('||t0.id||')' ids, str||','||'('||t0.col1||')' 
     from   t1, t0 
     where  t0.c2_rnk = t1.c2_rnk+1 
            and instr(t1.str,'('||t0.col1||')') =0
    ),
t2 as 
    (select t1.*, 
            rownum solution_number 
     from   t1 
     where  c2_rnk = (select max(c2_rnk) from t1)
    )
select  solution_number, col1, col2 
from    t0, t2 
where   instr(t2.ids,'('||t0.id||')') <> 0
order by 1,2,3


SOLUTION_NUMBER       COL1    COL2 
1                     1       x    
1                     2       y    
1                     3       z    
2                     1       y    
2                     2       x    
2                     3       z    
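For readers who find the recursive query hard to trace, the following Python sketch enumerates the same kind of solutions with plain backtracking. It is an illustration under assumptions, not a translation of the SQL: the name `enumerate_solutions` is hypothetical, the rows are assumed to contain no duplicates, and, like the query, it walks through the distinct col2 values in order and picks for each one a row whose col1 value has not been used yet.

```python
def enumerate_solutions(rows):
    """Return every way to pick one row per distinct col2 value
    such that no col1 value is used twice."""
    col2_values = sorted({c2 for _, c2 in rows})
    solutions = []

    def extend(index, chosen, used_col1):
        if index == len(col2_values):
            # Every col2 value has been assigned a row: record the solution.
            solutions.append(sorted(chosen))
            return
        target = col2_values[index]
        for c1, c2 in rows:
            if c2 == target and c1 not in used_col1:
                extend(index + 1, chosen + [(c1, c2)], used_col1 | {c1})

    extend(0, [], set())
    return solutions


rows = [(1, "x"), (1, "y"), (2, "x"), (2, "y"), (3, "x"), (3, "y"), (3, "z")]
for number, solution in enumerate(enumerate_solutions(rows), start=1):
    print(number, solution)
# 1 [(1, 'x'), (2, 'y'), (3, 'z')]
# 2 [(1, 'y'), (2, 'x'), (3, 'z')]
```

Partial assignments that cannot be completed (for example, giving x to 3, which leaves nothing for z) are simply dropped, which mirrors the filter on the maximum c2_rnk in the query.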

Answer 2: (score: 0)

What would the result be if there were another row, with col1 value 1 and col2 value xx?

In that case, a single-column result works better:

SELECT DISTINCT TO_CHAR(col1) FROM your_table
UNION ALL
SELECT DISTINCT col2 FROM your_table;

Answer 3: (score: 0)

My suggestion is something like this:

begin
    EXECUTE IMMEDIATE 'CREATE global TEMPORARY TABLE tmp(col1 NUMBER, col2 VARCHAR2(50))';
end;
/
DECLARE
    -- Distinct col1 values; the loop below picks at most one row for each.
    CURSOR cur_dist
    IS
        SELECT DISTINCT
            col1
        FROM
            ttable;
BEGIN
    FOR rec IN cur_dist
    LOOP
        INSERT INTO tmp
        SELECT
            col1,
            col2
        FROM
            ttable t1
        WHERE
            t1.col1         = rec.col1
        AND t1.col2 NOT IN
            (
                SELECT
                    tmp.col2
                FROM
                    tmp
            )
        AND t1.col1 NOT IN
            (
                SELECT
                    tmp.col1
                FROM
                    tmp
            )
        AND ROWNUM = 1;
    END LOOP;

    FOR rec in (select col1, col2 from tmp) LOOP
        DBMS_OUTPUT.PUT_LINE('col1: ' || rec.col1 || '|| col2: ' || rec.col2);
    END LOOP;

    EXECUTE IMMEDIATE 'DROP TABLE tmp';
END;
/

It could probably still use some improvement; I am particularly unhappy with the ROWNUM = 1 part.
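The loop in that block is a greedy selection, and a short Python sketch may help show where it can fall short. This is only an illustration under assumptions: `greedy_pick` is a hypothetical name, and "first matching row in list order" stands in for ROWNUM = 1, which in Oracle returns an arbitrary matching row.

```python
def greedy_pick(rows):
    """For each distinct col1 value, keep the first row whose col2
    value has not been taken by an earlier pick."""
    picked = []
    used_col2 = set()
    for c1 in sorted({c1 for c1, _ in rows}):
        for r1, r2 in rows:
            if r1 == c1 and r2 not in used_col2:
                picked.append((r1, r2))
                used_col2.add(r2)
                break  # one row per col1 value, like ROWNUM = 1
    return picked


rows = [(1, "x"), (1, "y"), (2, "x"), (2, "y"), (3, "x"), (3, "y"), (3, "z")]
print(greedy_pick(rows))  # [(1, 'x'), (2, 'y'), (3, 'z')]

# An early pick can block a later one: 1 takes x, so 2 gets nothing,
# even though (1, 'y') and (2, 'x') would cover both col1 values.
print(greedy_pick([(1, "x"), (1, "y"), (2, "x")]))  # [(1, 'x')]
```

The second call suggests one possible reason to be wary of the ROWNUM = 1 step: whichever row happens to be chosen first is never revisited, so the result depends on that choice.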

Answer 4: (score: 0)

SQL Fiddle

Oracle 11g R2 Schema Setup:

CREATE TABLE tbl ( col1, col2 ) AS
          SELECT 1, 'x' FROM DUAL
UNION ALL SELECT 1, 'y' FROM DUAL
UNION ALL SELECT 2, 'x' FROM DUAL
UNION ALL SELECT 2, 'y' FROM DUAL
UNION ALL SELECT 3, 'x' FROM DUAL
UNION ALL SELECT 3, 'y' FROM DUAL
UNION ALL SELECT 4, 'z' FROM DUAL;

Query 1:

WITH c1 AS (
  SELECT  DISTINCT
          col1,
          DENSE_RANK() OVER (ORDER BY col1) AS rank
  FROM    tbl
),
c2 AS (
  SELECT  DISTINCT
          col2,
          DENSE_RANK() OVER (ORDER BY col2) AS rank
  FROM    tbl
)
SELECT c1.col1,
       c2.col2
FROM   c1
       FULL OUTER JOIN c2
       ON ( c1.rank = c2.rank)
ORDER BY COALESCE( c1.rank, c2.rank)

Results:

| COL1 |   COL2 |
|------|--------|
|    1 |      x |
|    2 |      y |
|    3 |      z |
|    4 | (null) |

And to address the additional requirement:

  

"What we want is: if any value is present in col1, then it cannot be repeated in col2."

Query 2:

WITH c1 AS (
  SELECT  DISTINCT
          col1,
          DENSE_RANK() OVER (ORDER BY col1) AS rank
  FROM    tbl
),
c2 AS (
  SELECT  DISTINCT
          col2,
          DENSE_RANK() OVER (ORDER BY col2) AS rank
  FROM    tbl
  WHERE   col2 NOT IN ( SELECT TO_CHAR( col1 ) FROM c1 )
)
SELECT c1.col1,
       c2.col2
FROM   c1
       FULL OUTER JOIN c2
       ON ( c1.rank = c2.rank)
ORDER BY COALESCE( c1.rank, c2.rank)