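How to eliminate duplicates without losing the order

Time: 2019-07-12 20:37:52

Tags: sql oracle

I have a table that dynamically associates taxes with their concepts. The query can return several concepts and repeated taxes, and in the end I need to eliminate the duplicate records.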

WITH test_data AS
(
  --PRODUCT | VALUE
  --:------ | ----:
  SELECT 125  AS ord, 'Ties' AS product, 'P' AS concept FROM dual UNION ALL
  SELECT 127  AS ord, 'tax',     'P'    FROM dual UNION ALL
  SELECT 345  AS ord, 'Stocks',  'T'    FROM dual UNION ALL
  SELECT 346  AS ord, 'tax',     'P'    FROM dual UNION ALL
  SELECT 58   AS ord, 'Shirts',  'P'    FROM dual UNION ALL
  SELECT 59   AS ord, 'tax',     'P'    FROM dual UNION ALL  
  SELECT 723  AS ord, 'Shirts',  'P'    FROM dual UNION ALL
  SELECT 724  AS ord, 'tax',     'P'    FROM dual UNION ALL
  SELECT 95   AS ord, 'Shirts',  'P'    FROM dual UNION ALL
  SELECT 96   AS ord, 'tax',     'P'    FROM dual UNION ALL
  SELECT 1102 AS ord, 'Stocks',  'T'    FROM dual UNION ALL
  SELECT 1103 AS ord, 'tax',     'T'    FROM dual UNION ALL
  SELECT 366  AS ord, 'Stocks',  'T'    FROM dual UNION ALL
  SELECT 367  AS ord, 'tax',     'T'    FROM dual UNION ALL
  SELECT 1555 AS ord, 'Pants',   'T'    FROM dual UNION ALL
  SELECT 1556 AS ord, 'tax',     'T'    FROM dual UNION ALL
  SELECT 1787 AS ord, 'Stocks',  'T'    FROM dual UNION ALL
  SELECT 1788 AS ord, 'tax',     'T'    FROM dual UNION ALL
  SELECT 197  AS ord, 'Shirts',  'P'    FROM dual UNION ALL
  SELECT 198  AS ord, 'tax',     'P'    FROM dual
), 
test_data_extended AS
(
  SELECT product, 
         concept, 
         LAG(product, 1) OVER (ORDER BY ord) AS pre_product,ord
  FROM test_data
), 
test_data_new AS
(
  SELECT product AS old_product, 
         concept,
         ord, 
         CASE WHEN product = 'tax' THEN 
              'tax (' || pre_product || ')' 
         ELSE product 
         END AS new_product
  FROM test_data_extended
), 
new_data AS
(
  SELECT UNIQUE ord, 
         new_product, 
         concept
  FROM test_data_new
  ORDER BY concept
)
SELECT * FROM new_data

I found a possible solution that eliminates the duplicates, but it loses the order: each tax row should appear on the line right after its product:

test_data_new AS
(
  SELECT product AS old_product, 
         concept,
         ord, 
         CASE WHEN product = 'tax' THEN 
              'tax (' || pre_product || ')' 
         ELSE product 
         END AS new_product,
         CASE WHEN product = 'tax' THEN 
              1
         ELSE 0 
         END AS id_d
  FROM test_data_extended
), 
new_data AS
(
  SELECT UNIQUE ord, 
         new_product, 
         concept,
         id_d
  FROM test_data_new
  ORDER BY concept, id_d
)
SELECT * FROM new_data

The expected result would be something like the following:

 NEW_PRODUCT 
 -----------:
 Shirts      
 tax (Shirts)
 Ties        
 tax (Ties)  
 Pants       
 tax (Pants) 
 Stocks      
 tax (Stocks)

dbfiddle

2 answers:

Answer 0: (score: 2)

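Based on your data and a few assumptions (in particular that the ord values are not interleaved, even if there is a gap between an item and its (hopefully) related tax), you can do this: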

WITH test_data AS
(
...
), 
test_data_new AS
(
  SELECT product AS old_product, 
         concept,
         ord, 
         CASE WHEN product = 'tax' THEN 
              'tax (' || LAG(product, 1) OVER (ORDER BY ord) || ')' 
         ELSE product 
         END AS new_product,
         CASE WHEN product = 'tax' THEN 
              LAG(ord, 1) OVER (ORDER BY ord) 
         ELSE ord
         END AS new_ord,
         CASE WHEN product = 'tax' THEN 
              1
         ELSE 0 
         END AS id_d
  FROM test_data
)
SELECT new_product
FROM test_data_new
GROUP BY new_product, id_d
ORDER BY min(ord), id_d;

NEW_PRODUCT 
------------
Shirts
tax (Shirts)
Ties
tax (Ties)
Stocks
tax (Stocks)
Pants
tax (Pants)

8 rows selected. 

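I've removed two CTE levels that you don't really need, but the main change is adding another lag(), which ties each tax row to the ord of the non-tax item that precedes it.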
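
As written, the query above computes new_ord but sorts on min(ord). A variant that sorts on the tied value instead might look like the sketch below. It uses a reduced subset of the question's rows to stay short, and it assumes every product row is immediately followed (in ord order) by its tax row.

```sql
-- Sketch only: reduced subset of the question's test data.
WITH test_data AS
(
  SELECT 125 AS ord, 'Ties' AS product FROM dual UNION ALL
  SELECT 127, 'tax'    FROM dual UNION ALL
  SELECT 58,  'Shirts' FROM dual UNION ALL
  SELECT 59,  'tax'    FROM dual UNION ALL
  SELECT 723, 'Shirts' FROM dual UNION ALL
  SELECT 724, 'tax'    FROM dual
),
test_data_new AS
(
  SELECT CASE WHEN product = 'tax'
              THEN 'tax (' || LAG(product) OVER (ORDER BY ord) || ')'
              ELSE product
         END AS new_product,
         -- a tax row inherits the ord of the row just before it
         CASE WHEN product = 'tax'
              THEN LAG(ord) OVER (ORDER BY ord)
              ELSE ord
         END AS new_ord,
         -- 0 = product, 1 = tax, so the product sorts first on ties
         CASE WHEN product = 'tax' THEN 1 ELSE 0 END AS id_d
  FROM test_data
)
SELECT new_product
FROM test_data_new
GROUP BY new_product, id_d
ORDER BY MIN(new_ord), id_d;
-- Returns, in this order: Shirts, tax (Shirts), Ties, tax (Ties)
```

Because a product and its tax share the same MIN(new_ord), id_d alone decides which of the two comes first, so the pair stays adjacent regardless of the size of the gap between their original ord values.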

It all still seems a bit fragile, but it at least works with your data.
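
To illustrate the fragility, here is a small sketch with made-up rows (they are not from the question) in which two products are interleaved before their taxes. Since LAG only looks at the immediately preceding row, the labels come out wrong.

```sql
-- Hypothetical data that violates the "not interleaved" assumption.
WITH test_data AS
(
  SELECT 10 AS ord, 'Shirts' AS product FROM dual UNION ALL
  SELECT 11, 'Ties' FROM dual UNION ALL  -- arrives before the Shirts tax
  SELECT 12, 'tax'  FROM dual UNION ALL  -- meant for Shirts
  SELECT 13, 'tax'  FROM dual            -- meant for Ties
)
SELECT ord,
       CASE WHEN product = 'tax'
            THEN 'tax (' || LAG(product) OVER (ORDER BY ord) || ')'
            ELSE product
       END AS new_product
FROM test_data
ORDER BY ord;
-- ord 10 -> Shirts
-- ord 11 -> Ties
-- ord 12 -> tax (Ties)   (wrong: it was meant for Shirts)
-- ord 13 -> tax (tax)    (wrong: the preceding row is itself a tax)
```

If the real table can contain rows like these, the tax rows would need an explicit reference to their product rather than relying on position.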

db<>fiddle

Answer 1: (score: 0)

Here is a possible solution. I tried several approaches and this one works, so I'm sharing it in case anyone needs it.

WITH test_data AS
(
...
), 
test_data_new AS
(
  SELECT product AS old_product, 
         concept,
         ord, 
         CASE WHEN product = 'tax' THEN 
              'tax (' || pre_product || ')' 
         ELSE product 
         END AS new_product
  FROM test_data_extended
), 
new_data AS
(
  SELECT UNIQUE new_product, 
         concept,
         ord
  FROM test_data_new
  ORDER BY concept,ord
),
order_data AS 
(
  SELECT new_product, 
         ord,
         concept
    FROM new_data 
ORDER BY 2
),
filter_data AS 
(
  SELECT new_product, 
         MIN(ord) ord,
         concept
    FROM order_data 
GROUP BY new_product, concept
ORDER BY 2
)
SELECT new_product
  FROM filter_data 

dbfiddle
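
One caveat about the query above: its outermost SELECT has no ORDER BY, and an ORDER BY inside a CTE is not guaranteed to determine the order of the final result. A minimal sketch that moves the sort to the outer query could look like this. The new_data rows here are hypothetical, already-labelled stand-ins for the output of the earlier CTEs.

```sql
-- Sketch only: new_data stands in for the labelled rows built earlier.
WITH new_data AS
(
  SELECT 58 AS ord, 'Shirts' AS new_product, 'P' AS concept FROM dual UNION ALL
  SELECT 59,  'tax (Shirts)', 'P' FROM dual UNION ALL
  SELECT 125, 'Ties',         'P' FROM dual UNION ALL
  SELECT 127, 'tax (Ties)',   'P' FROM dual UNION ALL
  SELECT 723, 'Shirts',       'P' FROM dual UNION ALL
  SELECT 724, 'tax (Shirts)', 'P' FROM dual
),
filter_data AS
(
  -- keep the first occurrence of each label per concept
  SELECT new_product,
         MIN(ord) AS ord,
         concept
  FROM new_data
  GROUP BY new_product, concept
)
SELECT new_product
FROM filter_data
ORDER BY ord;  -- the sort lives on the outermost query
-- Returns, in this order: Shirts, tax (Shirts), Ties, tax (Ties)
```

Note that grouping by concept as well means a label that occurs under two different concepts would still appear twice.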