How to merge records in Oracle SQL

Date: 2016-05-10 18:38:53

Tags: sql oracle oracle11g

I have a table like the one below:

Key CL  EmailAddress    CT  Product1    Product2    Product3    Product4    Product5
1   X   abc@gmail.com   A   12          null        null        null        null
2   X   abc@gmail.com   B   123         22          null        null        null

Each row can hold up to 5 products. Every record has at least 1 product and no more than 5.

Also, an email address can be repeated within the same CL.

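I need to write a query that checks whether an email address is repeated within the same CL and, if it is, merges the product IDs into a single row for that email address.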

Once I have 5 products, I need to stop and discard the rest.

So the output for the example above should look like this:

Key CL  EmailAddress    CT  Product1    Product2    Product3    Product4    Product5
1   X   abc@gmail.com   A+B 12          123         22          null        null

Can this be done in an Oracle SQL query?

3 Answers:

Answer 0 (score: 3)

Oracle Setup

CREATE TABLE table_name (
  "Key"        INT PRIMARY KEY,
  CL           CHAR(1),
  EmailAddress VARCHAR2(100),
  CT           VARCHAR2(100),
  Product1     INT,
  Product2     INT,
  Product3     INT,
  Product4     INT,
  Product5     INT
);

INSERT INTO table_name
SELECT 1, 'X', 'abc@gmail.com', 'A', 12,  null, null, null, null FROM DUAL
UNION ALL
SELECT 2, 'X', 'abc@gmail.com', 'B', 123,   22, null, null, null FROM DUAL;

CREATE TYPE stringlist AS TABLE OF VARCHAR2(100);
/

CREATE OR REPLACE FUNCTION nth_item(
  collection STRINGLIST,
  n          INT
) RETURN VARCHAR2 DETERMINISTIC
AS
BEGIN
  IF collection IS NULL OR n < 1 OR n > collection.COUNT THEN
    RETURN NULL;
  END IF;
  RETURN collection(n);
END;
/

Query

SELECT "Key",
       CL,
       EmailAddress,
       CT,
       Nth_Item( products, 1 ) AS Product1,
       Nth_Item( products, 2 ) AS Product2,
       Nth_Item( products, 3 ) AS Product3,
       Nth_Item( products, 4 ) AS Product4,
       Nth_Item( products, 5 ) AS Product5
FROM   (
  SELECT MIN( "Key" ) AS "Key",
         CL,
         EmailAddress,
         REGEXP_REPLACE(
           LISTAGG( CT, '+' ) WITHIN GROUP ( ORDER BY CT ),
           '(.)(\+\1)+',
           '\1'
         ) AS CT,
         CAST( COLLECT( COLUMN_VALUE ) AS stringlist ) AS products
  FROM   table_name t,
         TABLE(
           STRINGLIST(
             t.Product1,
             t.Product2,
             t.Product3,
             t.Product4,
             t.Product5
           )
         )
  WHERE  COLUMN_VALUE IS NOT NULL
  GROUP BY CL, EmailAddress
);

Output

Key CL EMAILADDRESS  CT  PRODUCT1 PRODUCT2 PRODUCT3 PRODUCT4 PRODUCT5
--- -- ------------- --- -------- -------- -------- -------- --------
  1 X  abc@gmail.com A+B       12       22      123
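
The REGEXP_REPLACE call in the query collapses adjacent repeats in the LISTAGG result. The following is a minimal sketch of what that pattern does, run against DUAL with illustrative literals that are not data from the question. It suggests two things worth keeping in mind: repeats must be adjacent to be collapsed (which is presumably why the LISTAGG is ordered by CT), and because the pattern captures a single character it seems suited only to single-character CT values.

```sql
-- Illustrative literals only; the pattern is the one used in the query.
SELECT REGEXP_REPLACE('A+A+B', '(.)(\+\1)+', '\1') AS adjacent_dupes,   -- A+B
       REGEXP_REPLACE('A+B+A', '(.)(\+\1)+', '\1') AS separated_dupes,  -- A+B+A (unchanged)
       REGEXP_REPLACE('AB+B',  '(.)(\+\1)+', '\1') AS multi_char        -- AB (the trailing B is swallowed)
FROM   DUAL;
```

If CT can hold longer codes, a different de-duplication step would be needed; that case is outside what the question shows.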

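Answer 1 (score: 3)

As an alternative to MTO's approach, you can unpivot the data from the table (by default, UNPIVOT leaves out rows whose product is NULL):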

select *
from your_table
unpivot (product for pos in (product1 as 1, product2 as 2, product3 as 3,
  product4 as 4, product5 as 5));

       KEY CL EMAILADDRESS  CT         POS    PRODUCT
---------- -- ------------- --- ---------- ----------
         1 X  abc@gmail.com A            1         12
         2 X  abc@gmail.com B            1        123
         2 X  abc@gmail.com B            2         22

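Use that to pick the key and build the CT value (shamelessly borrowing MTO's regular expression to remove duplicates), and to generate new position values: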

with t as (
  select *
  from your_table
  unpivot (product for pos in (product1 as 1, product2 as 2, product3 as 3,
    product4 as 4, product5 as 5))
)
select min(key) over (partition by cl, emailaddress) as key,
  cl,
  emailaddress,
  regexp_replace(
    listagg(ct, '+') within group (order by key) over (partition by cl, emailaddress),
    '(.)(\+\1)+', '\1') as ct,
  rank() over (partition by cl, emailaddress order by key, pos) as pos,
  product
from t;

       KEY CL EMAILADDRESS  CT         POS    PRODUCT
---------- -- ------------- --- ---------- ----------
         1 X  abc@gmail.com A+B          1         12
         1 X  abc@gmail.com A+B          2        123
         1 X  abc@gmail.com A+B          3         22
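
The regular expression is needed at this step because unpivoting repeats each row's CT once per product, so the aggregated string contains repeats. A minimal sketch of that effect follows, using inline rows that mirror the unpivoted output shown earlier; the CTE name `unpivoted` and the column name `key1` are placeholders chosen for this illustration, and GROUP BY is used instead of the analytic form only to return a single row.

```sql
-- Inline copy of the three unpivoted rows; names are illustrative.
WITH unpivoted (key1, cl, emailaddress, ct, pos, product) AS (
  SELECT 1, 'X', 'abc@gmail.com', 'A', 1, 12  FROM DUAL UNION ALL
  SELECT 2, 'X', 'abc@gmail.com', 'B', 1, 123 FROM DUAL UNION ALL
  SELECT 2, 'X', 'abc@gmail.com', 'B', 2, 22  FROM DUAL
)
SELECT LISTAGG(ct, '+') WITHIN GROUP (ORDER BY key1) AS raw_ct,  -- A+B+B
       REGEXP_REPLACE(
         LISTAGG(ct, '+') WITHIN GROUP (ORDER BY key1),
         '(.)(\+\1)+', '\1') AS ct                               -- A+B
FROM   unpivoted
GROUP BY cl, emailaddress;
```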

Then finally pivot it back:

with t as (
  select *
  from your_table
  unpivot (product for pos in (product1 as 1, product2 as 2, product3 as 3,
    product4 as 4, product5 as 5))
)
select key, cl, emailaddress, ct, a_product as product1, b_product as product2,
  c_product as product3, d_product as product4, e_product as product5
from (
  select min(key) over (partition by cl, emailaddress) as key,
    cl,
    emailaddress,
    regexp_replace(
      listagg(ct, '+') within group (order by key) over (partition by cl, emailaddress),
      '(.)(\+\1)+', '\1') as ct,
    rank() over (partition by cl, emailaddress order by key, pos) as pos,
    product
  from t
)
pivot (max(product) as product for (pos) in (1 as a, 2 as b, 3 as c, 4 as d, 5 as e));

       KEY CL EMAILADDRESS  CT    PRODUCT1   PRODUCT2   PRODUCT3   PRODUCT4   PRODUCT5
---------- -- ------------- --- ---------- ---------- ---------- ---------- ----------
         1 X  abc@gmail.com A+B         12        123         22                      

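This is made slightly more complicated by making the column names in the final result match the original table. I've also assumed you want to keep the lowest key value, concatenate the CT values in key order, and keep the products in the order they originally appeared; or at least the first key's products in their original order, then the second key's products in their original order, and so on.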

Answer 2 (score: 2)

Another option:

with test_data(Key1, CL,  EmailAddress,    CT,  Product1,    Product2,    Product3,    Product4,    Product5)
as (
select 1, 'X', 'abc@gmail.com', 'A', 12 , null, null, null, null from DUAL union all
select 2, 'X', 'abc@gmail.com', 'B', 12 , 123,  null, null, null from DUAL   
)

select 
  min(KEY1) as KEY1,
  CL,
  EmailAddress,
  case when instr(min(CT1), '+', 1, 5) = 0
       then min(CT1)
       else substr(min(CT1), 1, instr(min(CT1), '+', 1, 5)-1)
  end CT,
  max(PRODUCT1) PRODUCT1,
  max(PRODUCT2) PRODUCT2,
  max(PRODUCT3) PRODUCT3,
  max(PRODUCT4) PRODUCT4,
  max(PRODUCT5) PRODUCT5
from
(
  select * from
    (
      select t.*, 
      row_number() over (partition by CL,EmailAddress order by key1) RN
      FROM
      (
         select *  from 
         ( 
           select temp.*,
                  listagg(CT,'+') within group (order by key1) over (partition by CL, EmailAddress) CT1
           from test_data temp
         )
         unpivot
         (prod FOR col in (PRODUCT1,PRODUCT2,PRODUCT3,PRODUCT4,PRODUCT5))
         order by KEY1
      ) t
    ) t1
    where RN <=5
  ) 
  pivot
  (
    min(prod)
    for RN in (1 as PRODUCT1,2 as PRODUCT2,3 as PRODUCT3,4 as PRODUCT4,5 as PRODUCT5)
  )      
group by CL, EmailAddress
;
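
The CASE expression in the query above appears to cap the merged CT string at five values by cutting it just before the fifth '+'. A minimal sketch of that expression in isolation follows, applied to two illustrative strings; the CTE name `samples` and its contents are made up for this example.

```sql
-- Illustrative strings only; the CASE expression mirrors the one in the query.
WITH samples (ct1) AS (
  SELECT 'A+B'         FROM DUAL UNION ALL
  SELECT 'A+B+C+D+E+F' FROM DUAL
)
SELECT ct1,
       CASE WHEN INSTR(ct1, '+', 1, 5) = 0            -- no fifth '+': five values or fewer
            THEN ct1
            ELSE SUBSTR(ct1, 1, INSTR(ct1, '+', 1, 5) - 1)
       END AS ct
FROM   samples;
-- 'A+B' stays 'A+B'; 'A+B+C+D+E+F' becomes 'A+B+C+D+E'
```

INSTR returns 0 when the fifth occurrence does not exist, which is why the first branch returns the string unchanged.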