WITH clause SQL throws an error when deleting rows, but works fine for a SELECT statement

Time: 2018-01-18 01:18:58

Tags: sql amazon-redshift common-table-expression sql-delete

with de_duplicate (ad_id, id_type, lat, long) AS (
select ad_id, id_type, lat, long,
Row_Number() over(partition by ad_id,id_type, lat, long) AS duplicate_count
from tempschema.temp_test)
select * from de_duplicate;

The query above runs successfully, but when I try to perform a delete

with de_duplicate(ad_id, id_type, lat, long) AS 
(
select ad_id, id_type, lat, long,
Row_Number() over(partition by ad_id,id_type, lat, long) AS duplicate_count
from tempschema.temp_test
)
delete from de_duplicate where duplicate_count > 1;

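it throws the error: Amazon Invalid operation: syntax error at or near "delete" Position: 190;

I am running these queries on a Redshift cluster. Any ideas?

2 answers:

Answer 0: (score: 0)

Consider converting the CTE to a subquery and adding a unique_id to match against the outer query: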

-- Rows with dup_count > 1 are the extra copies, so those are the ones to delete.
DELETE FROM tempschema.temp_test
WHERE unique_id IN
  (SELECT sub.unique_id
   FROM 
      (SELECT unique_id, ad_id, id_type, lat, long,
              ROW_NUMBER() OVER (PARTITION BY ad_id, id_type, lat, long
                                 ORDER BY unique_id) AS dup_count
        FROM tempschema.temp_test) sub
   WHERE sub.dup_count > 1);

Alternatively, consider using an aggregate subquery for the delete:

DELETE FROM tempschema.temp_test
WHERE unique_id NOT IN
   (SELECT MIN(unique_id)
    FROM tempschema.temp_test
    GROUP BY ad_id, id_type, lat, long)

Of course, both assume you have a unique_id column in the table, but they can be adjusted if you do not.
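If the table has no unique_id, one possible adjustment is to rebuild its contents from a de-duplicated copy. The following is only a sketch: it assumes tempschema.temp_test contains just the four columns shown in the question, and temp_test_dedup is a hypothetical name for the staging table.

```sql
-- Sketch only: assumes temp_test has exactly these four columns.
-- temp_test_dedup is a hypothetical staging table name.
BEGIN;

-- Keep one copy of each distinct row.
CREATE TEMP TABLE temp_test_dedup AS
SELECT DISTINCT ad_id, id_type, lat, long
FROM tempschema.temp_test;

-- Empty the source table and reload it from the de-duplicated copy.
DELETE FROM tempschema.temp_test;

INSERT INTO tempschema.temp_test (ad_id, id_type, lat, long)
SELECT ad_id, id_type, lat, long
FROM temp_test_dedup;

COMMIT;
```

DELETE is used here instead of TRUNCATE so that the reload stays inside a single transaction. If the table has more columns than the four listed, the SELECT DISTINCT would need to be replaced with a rule for choosing which row to keep.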

Answer 1: (score: 0)

I see what you are trying to do, and it is a common problem, but there are two issues with this approach:

1) You are trying to delete from the query result (de_duplicate) rather than from the source table (tempschema.temp_test). Even though you identify the duplicates in the WITH statement, the result has no connection to the source table tempschema.temp_test.

2) A CTE (WITH clause) cannot be used directly with DELETE or UPDATE; these statements need a joined subquery instead.

There are two possible approaches in your case:

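1) If your table has a unique ID and a duplicate criterion, use a joined subquery (in the test case below the duplicate criterion is val, so id = 3 and id = 4 are duplicates):

create table test1 (id integer, val integer);
insert into test1 values (1,1),(2,2),(3,3),(4,3);

-- Rank rows within each val (highest id first) and delete everything after the first.
delete from test1 using (
    select *
    from (
        select *, row_number() over (partition by val order by id desc) as rn
        from test1
    ) ranked
    where rn > 1
) s
where test1.id = s.id;

2) Create a cleaned temporary table and swap the tables: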
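The code for the "create a cleaned table and swap" approach is not shown here, so the following is only a sketch of how it might look on the same test1 table. The names test1_clean and test1_old are hypothetical, and the rule for which duplicate to keep (the highest id per val) is an assumption carried over from the delete example.

```sql
-- Sketch only: test1_clean and test1_old are hypothetical names.
BEGIN;

-- Build a cleaned copy that keeps one row per val (the one with the highest id).
CREATE TABLE test1_clean AS
SELECT id, val
FROM (
    SELECT id, val,
           ROW_NUMBER() OVER (PARTITION BY val ORDER BY id DESC) AS rn
    FROM test1
) ranked
WHERE rn = 1;

-- Swap the cleaned table into place and drop the original.
ALTER TABLE test1 RENAME TO test1_old;
ALTER TABLE test1_clean RENAME TO test1;
DROP TABLE test1_old;

COMMIT;
```

A table created with CREATE TABLE AS may not carry over attributes of the original such as defaults, constraints, or grants, so it is worth comparing the two table definitions before dropping the old one.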