I have a table like the one below:
Key CL EmailAddress CT Product1 Product2 Product3 Product4 Product5
1 X abc@gmail.com A 12 null null null null
2 X abc@gmail.com B 123 22 null null null
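Each row can hold up to 5 products. Every record has at least 1 and no more than 5 products.
In addition, an email address can be repeated within the same CL.
I have to write a query that finds out whether an email address is repeated within the same CL and, if it is,
merges the product IDs under that one email address.
Once I have 5 products, I need to stop and discard the rest.
So the output for the example above should look like this: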
Key CL EmailAddress CT Product1 Product2 Product3 Product4 Product5
1 X abc@gmail.com A+B 12 123 22 null null
Can we do something like this in an Oracle SQL query?
Answer 0 (score: 3)
Oracle Setup:
CREATE TABLE table_name (
"Key" INT PRIMARY KEY,
CL CHAR(1),
EmailAddress VARCHAR2(100),
CT VARCHAR2(100),
Product1 INT,
Product2 INT,
Product3 INT,
Product4 INT,
Product5 INT
);
INSERT INTO table_name
SELECT 1, 'X', 'abc@gmail.com', 'A', 12, null, null, null, null FROM DUAL
UNION ALL
SELECT 2, 'X', 'abc@gmail.com', 'B', 123, 22, null, null, null FROM DUAL;
CREATE TYPE stringlist AS TABLE OF VARCHAR2(100);
/
CREATE OR REPLACE FUNCTION nth_item(
collection STRINGLIST,
n INT
) RETURN VARCHAR2 DETERMINISTIC
AS
BEGIN
IF collection IS NULL OR n < 1 OR n > collection.COUNT THEN
RETURN NULL;
END IF;
RETURN collection(n);
END;
/
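As a quick sanity check of the helper, the sketch below calls nth_item directly. It assumes the stringlist type and the nth_item function from the setup above have already been created in the current schema; the literal values are made up for illustration.

```sql
-- Illustrative call only: the three literals are hypothetical test values.
SELECT nth_item( stringlist( 'a', 'b', 'c' ), 2 ) AS second_item,   -- 'b'
       nth_item( stringlist( 'a', 'b', 'c' ), 4 ) AS out_of_range   -- NULL (n > collection.COUNT)
FROM   DUAL;
```

Returning NULL for an out-of-range index is what lets the main query ask for positions 1 to 5 even when fewer products were collected.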
Query:
SELECT "Key",
CL,
EmailAddress,
CT,
Nth_Item( products, 1 ) AS Product1,
Nth_Item( products, 2 ) AS Product2,
Nth_Item( products, 3 ) AS Product3,
Nth_Item( products, 4 ) AS Product4,
Nth_Item( products, 5 ) AS Product5
FROM (
SELECT MIN( "Key" ) AS "Key",
CL,
EmailAddress,
REGEXP_REPLACE(
LISTAGG( CT, '+' ) WITHIN GROUP ( ORDER BY CT ),
'(.)(\+\1)+',
'\1'
) AS CT,
CAST( COLLECT( COLUMN_VALUE ) AS stringlist ) AS products
FROM table_name t,
TABLE(
STRINGLIST(
t.Product1,
t.Product2,
t.Product3,
t.Product4,
t.Product5
)
)
WHERE COLUMN_VALUE IS NOT NULL
GROUP BY CL, EmailAddress
);
Output:
Key CL EMAILADDRESS CT PRODUCT1 PRODUCT2 PRODUCT3 PRODUCT4 PRODUCT5
--- -- ------------- --- -------- -------- -------- -------- --------
1 X abc@gmail.com A+B 12 22 123
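The products above come back as 12, 22, 123 rather than the 12, 123, 22 asked for in the question, most likely because COLLECT without an ORDER BY gives no guarantee about element order. Below is a minimal sketch of one way to pin the order down. It assumes Oracle 11g Release 2 or later (where COLLECT accepts an ORDER BY) and reuses table_name and stringlist from the setup. Using UNPIVOT to expose a position column is a substitution of mine, not part of the original answer.

```sql
-- Sketch: collect the products in (Key, column position) order.
-- "Key" * 10 + pos is a single sort expression; it works because pos is 1..5.
SELECT MIN( "Key" ) AS "Key",
       CL,
       EmailAddress,
       CAST(
         COLLECT( TO_CHAR( product ) ORDER BY "Key" * 10 + pos ) AS stringlist
       ) AS products
FROM   table_name
UNPIVOT ( product FOR pos IN ( Product1 AS 1, Product2 AS 2, Product3 AS 3,
                               Product4 AS 4, Product5 AS 5 ) )  -- NULL products are skipped by default
GROUP BY CL, EmailAddress;
```

Only the products column is shown here. The resulting collection could be passed to Nth_Item in the same way as in the query above.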
Answer 1 (score: 3)
As an alternative to MTO's approach, you can unpivot the data from your table:
select *
from your_table
unpivot (product for pos in (product1 as 1, product2 as 2, product3 as 3,
product4 as 4, product5 as 5));
KEY CL EMAILADDRESS CT POS PRODUCT
---------- -- ------------- --- ---------- ----------
1 X abc@gmail.com A 1 12
2 X abc@gmail.com B 1 123
2 X abc@gmail.com B 2 22
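Use that to pick the key and generate the CT value (shamelessly pinching MTO's regular expression to remove duplicates), and to generate new position values: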
with t as (
select *
from your_table
unpivot (product for pos in (product1 as 1, product2 as 2, product3 as 3,
product4 as 4, product5 as 5))
)
select min(key) over (partition by cl, emailaddress) as key,
cl,
emailaddress,
regexp_replace(
listagg(ct, '+') within group (order by key) over (partition by cl, emailaddress),
'(.)(\+\1)+', '\1') as ct,
rank() over (partition by cl, emailaddress order by key, pos) as pos,
product
from t;
KEY CL EMAILADDRESS CT POS PRODUCT
---------- -- ------------- --- ---------- ----------
1 X abc@gmail.com A+B 1 12
1 X abc@gmail.com A+B 2 123
1 X abc@gmail.com A+B 3 22
Then finally pivot it back:
with t as (
select *
from your_table
unpivot (product for pos in (product1 as 1, product2 as 2, product3 as 3,
product4 as 4, product5 as 5))
)
select key, cl, emailaddress, ct, a_product as product1, b_product as product2,
c_product as product3, d_product as product4, e_product as product5
from (
select min(key) over (partition by cl, emailaddress) as key,
cl,
emailaddress,
regexp_replace(
listagg(ct, '+') within group (order by key) over (partition by cl, emailaddress),
'(.)(\+\1)+', '\1') as ct,
rank() over (partition by cl, emailaddress order by key, pos) as pos,
product
from t
)
pivot (max(product) as product for (pos) in (1 as a, 2 as b, 3 as c, 4 as d, 5 as e));
KEY CL EMAILADDRESS CT PRODUCT1 PRODUCT2 PRODUCT3 PRODUCT4 PRODUCT5
---------- -- ------------- --- ---------- ---------- ---------- ---------- ----------
1 X abc@gmail.com A+B 12 123 22
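It is made slightly more complicated by making the column names in the final result match the original table. I've also assumed that you want to keep the lowest key value, concatenate the CT values in key order, and keep the products in the order in which they originally appeared: the products from the first key in their original order, then the products from the second key in their original order, and so on.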
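One caveat about the borrowed pattern `(.)(\+\1)+`: it only collapses adjacent repeats of a single-character CT value, so multi-character values would not be de-duplicated. As a hedged alternative, assuming Oracle 19c or later (where LISTAGG accepts DISTINCT) and the same your_table placeholder, the duplicates can be dropped inside the aggregate itself. Note that this sketch orders the values by ct rather than by key, so the result may differ from the key-ordered concatenation used above.

```sql
-- Sketch: de-duplicate CT values without a regular expression (assumes Oracle 19c+).
SELECT MIN( key ) AS key,
       cl,
       emailaddress,
       LISTAGG( DISTINCT ct, '+' ) WITHIN GROUP ( ORDER BY ct ) AS ct
FROM   your_table
GROUP BY cl, emailaddress;
```

For the two sample rows this should yield a single row with CT = A+B. The product columns would still need the unpivot and pivot steps shown earlier.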
Answer 2 (score: 2)
Another option:
with test_data(Key1, CL, EmailAddress, CT, Product1, Product2, Product3, Product4, Product5)
as (
select 1, 'X', 'abc@gmail.com', 'A', 12 , null, null, null, null from DUAL union all
select 2, 'X', 'abc@gmail.com', 'B', 12 , 123, null, null, null from DUAL
)
select
min(KEY1) as KEY1,
CL,
EmailAddress,
case when instr(min(CT1), '+', 1, 5) = 0
then min(CT1)
else substr(min(CT1), 1, instr(min(CT1), '+', 1, 5)-1)
end CT,
max(PRODUCT1) PRODUCT1,
max(PRODUCT2) PRODUCT2,
max(PRODUCT3) PRODUCT3,
max(PRODUCT4) PRODUCT4,
max(PRODUCT5) PRODUCT5
from
(
select * from
(
select t.*,
row_number() over (partition by CL,EmailAddress order by key1, col) RN -- col breaks ties within a key, so the numbering is deterministic
FROM
(
select * from
(
select temp.*,
listagg(CT,'+') within group (order by key1) over (partition by CL, EmailAddress) CT1
from test_data temp
)
unpivot
(prod FOR col in (PRODUCT1,PRODUCT2,PRODUCT3,PRODUCT4,PRODUCT5))
order by KEY1
) t
) t1
where RN <=5
)
pivot
(
min(prod)
for RN in (1 as PRODUCT1,2 as PRODUCT2,3 as PRODUCT3,4 as PRODUCT4,5 as PRODUCT5)
)
group by CL, EmailAddress
;