SQL: Filtering rows

Time: 2012-01-11 17:08:10

Tags: sql postgresql postgresql-9.1

I'm trying to write an SQL query that returns rows from a table containing data.

The table structure is as follows:

CREATE TABLE person(
    id INT PRIMARY KEY,
    name TEXT,
    operation TEXT);

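I want to return all unique-name rows that have not been "cancelled". A row is considered "cancelled" if its operation is "insert" or "delete" and there is another row with the same name and the opposite operation.

For example, if I have the following rows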

id   name   operation
1    bob    insert
2    bob    delete
3    bob    insert

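The first two rows "cancel" each other because they share the same name with opposite operations. So the query should return row 3.

Here is another example: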

id   name   operation
1    bob    insert
2    bob    delete
3    bob    insert
4    bob    delete

In this case, rows 1 and 2 cancel, and rows 3 and 4 cancel. So the query should not return any rows.

One last example:

id   name   operation
1    bob    insert
2    bob    insert

In this case, rows 1 and 2 do not cancel because their operations are not opposite. So the query should return both rows.
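
The three examples can be captured in a small reference model, which is handy for checking any candidate query. This is a minimal sketch in Python, not part of the original question. It assumes that the oldest insert and the oldest delete of a name cancel first (so the newest surplus rows survive, as in the first example) and that rows with any other operation are simply ignored. The function name `surviving_ids` is hypothetical.

```python
from collections import defaultdict, deque


def surviving_ids(rows):
    """Return the ids left after pairing off inserts and deletes per name.

    rows: iterable of (id, name, operation) tuples.
    Assumption: the oldest insert cancels the oldest delete of the same name.
    Rows whose operation is neither 'insert' nor 'delete' are ignored.
    """
    inserts = defaultdict(deque)
    deletes = defaultdict(deque)
    for row_id, name, operation in sorted(rows):
        if operation == "insert":
            inserts[name].append(row_id)
        elif operation == "delete":
            deletes[name].append(row_id)

    survivors = []
    for name in set(inserts) | set(deletes):
        ins, dels = inserts[name], deletes[name]
        # Pair off one insert with one delete until one side runs out.
        while ins and dels:
            ins.popleft()
            dels.popleft()
        survivors.extend(ins)
        survivors.extend(dels)
    return sorted(survivors)


example_1 = [(1, "bob", "insert"), (2, "bob", "delete"), (3, "bob", "insert")]
example_2 = example_1 + [(4, "bob", "delete")]
example_3 = [(1, "bob", "insert"), (2, "bob", "insert")]

print(surviving_ids(example_1))  # [3]
print(surviving_ids(example_2))  # []
print(surviving_ids(example_3))  # [1, 2]
```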

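I have the following query, which handles the first two scenarios but not the last one.

Does anyone have a suggestion for a query that handles all 3 cases?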

SELECT MAX(id),name 
FROM person z 
WHERE operation IN ('insert','delete') 
GROUP BY name 
HAVING count(1) % 2 = 1;
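
The parity test in the HAVING clause is what breaks the last scenario: two inserts give an even count, so the whole name is dropped, and `MAX(id)` could return at most one row per name anyway. The sketch below reproduces this with Python's built-in `sqlite3` module. SQLite is only a stand-in here for PostgreSQL (an assumption made so the example is self-contained), and the helper name `run_parity_query` is hypothetical.

```python
import sqlite3

# The query from the question, unchanged apart from the trailing semicolon.
PARITY_QUERY = """
SELECT MAX(id), name
FROM person z
WHERE operation IN ('insert','delete')
GROUP BY name
HAVING count(1) % 2 = 1
"""


def run_parity_query(rows):
    """Load rows into an in-memory person table and run the parity query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE person(id INT PRIMARY KEY, name TEXT, operation TEXT)"
    )
    conn.executemany("INSERT INTO person VALUES (?, ?, ?)", rows)
    result = conn.execute(PARITY_QUERY).fetchall()
    conn.close()
    return result


scenario_1 = [(1, "bob", "insert"), (2, "bob", "delete"), (3, "bob", "insert")]
scenario_3 = [(1, "bob", "insert"), (2, "bob", "insert")]

print(run_parity_query(scenario_1))  # [(3, 'bob')], as wanted
print(run_parity_query(scenario_3))  # [], although rows 1 and 2 are wanted
```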

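3 Answers:

Answer 0: (score: 4)

One way is to compare the operation counts. Since you also need to return as many INSERT or DELETE rows as the difference InsertCount - deleteCount or deleteCount - InsertCount, and since PostgreSQL supports window functions, you should be able to use row_number().

Note: I haven't tested this, but according to the PostgreSQL manual (Chapter 3. Advanced Features, 3.5 Window Functions), you can reference a window function in an inline query.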

SELECT
       id, name
FROM
   (
    SELECT 
            row_number() over (partition by p.name, p.operation order by p.id desc) rn , 
            id,  
            p.Name,
            p.operation, 
            operationCounts.InsertCount,
            operationCounts.deleteCount

    FROM 
       Person p
    INNER JOIN (

        SELECT 
          SUM(CASE WHEN operation = 'insert' then 1 else 0 END) InsertCount,
          SUM(CASE WHEN operation = 'delete' then 1 else 0 END) deleteCount,
          name 
        FROM 
           person 
        GROUP BY
           name ) operationCounts
    ON p.name = operationCounts.name
    WHERE 
      operationCounts.InsertCount <> operationCounts.deleteCount) data
WHERE
      (rn <=  (InsertCount -  deleteCount)
      and operation = 'insert')
      OR
     (rn <=  (deleteCount -  InsertCount)
      and operation = 'delete')
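
To make the steps of this query easier to follow, here is a minimal sketch in Python that mimics them outside the database: count the operations per name, number the rows of each (name, operation) group from newest to oldest, and keep the rows whose number does not exceed the surplus. It is an illustration under the assumption that the query behaves as described, not a tested port. The function name `surplus_rows` is hypothetical.

```python
from collections import Counter, defaultdict


def surplus_rows(rows):
    """Keep the newest |inserts - deletes| rows of each name's majority operation.

    rows: iterable of (id, name, operation) tuples.
    Returns a sorted list of (id, name) tuples.
    """
    rows = list(rows)
    # Plays the role of the operationCounts subquery.
    counts = Counter((name, operation) for _, name, operation in rows)

    # Groups correspond to "partition by p.name, p.operation".
    groups = defaultdict(list)
    for row_id, name, operation in rows:
        groups[(name, operation)].append(row_id)

    kept = []
    for (name, operation), ids in groups.items():
        insert_count = counts[(name, "insert")]
        delete_count = counts[(name, "delete")]
        if operation == "insert":
            surplus = insert_count - delete_count
        elif operation == "delete":
            surplus = delete_count - insert_count
        else:
            continue  # other operations never match the outer WHERE
        # "order by p.id desc": rn = 1 is the newest row of the group.
        for rn, row_id in enumerate(sorted(ids, reverse=True), start=1):
            if rn <= surplus:
                kept.append((row_id, name))
    return sorted(kept)


sample = [
    (1, "bob", "insert"), (2, "bob", "delete"), (3, "bob", "insert"),
    (4, "amy", "delete"), (5, "amy", "delete"), (6, "amy", "insert"),
]
print(surplus_rows(sample))  # [(3, 'bob'), (5, 'amy')]
```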

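Answer 1: (score: 1)

Best speed and shortest answer: the problem can be reduced to

  1. Count the delete operations per name (cnt_del)
  2. Ignore the first cnt_del inserts
  3. This can be written in a single pass like this (I don't know whether everything in this query is valid):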

    select * from(
        SELECT id, name, operation,
           row_number() over (partition by name order by case 
                                                         when operation = 'insert' 
                                                         then id 
                                                         else null end 
                                                nulls last ) rnk_insert,
           count(case 
                 when operation='delete' then 1 
                 else null 
                 end) over (partition by name) as cnt_del 
        FROM person z 
        WHERE operation IN ('insert','delete') 
    ) ranked  -- a subquery in FROM needs an alias in PostgreSQL 9.1
    where operation = 'insert'  -- delete rows sort last and would otherwise pass the rank test
      and rnk_insert > cnt_del
    

    If the previous query does not work in Postgres (AFAIK, Oracle can handle it), the solution can be written in this simpler way:

    select i.id, i.name 
    from
    
      -- number the inserts of each name, oldest first
      (select id, name, 
             row_number() over (partition by name order by id) as rnk_insert
      from person z
      where operation='insert') i
    
      left join 
    
      -- count the deletes per name
      (select name, count(*) as cnt_del
      from person z 
      where operation='delete'
      group by name) d
    
      on d.name = i.name
    
    where rnk_insert > coalesce(cnt_del, 0)
    

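Answer 2: (score: 0)

Testing showed that my original query was slower than @Conrad's excellent query. Humbled, I tried a few things and came up with a query that is actually simpler and faster.

Test setup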

INSERT INTO person
SELECT i
      ,'name' || (random() * 500)::int::text
      ,CASE WHEN random() >= 0.5 THEN 'insert' ELSE 'delete' END
FROM   generate_series(1,10000) AS i;

Query:

SELECT id, name, operation
FROM  (
    SELECT row_number() OVER (PARTITION BY name, operation ORDER by id) AS rn
          ,id
          ,name
          ,operation
          ,y.cancel
    FROM  (
       SELECT name
             ,least(ct_del, ct_all - ct_del) AS cancel
       FROM  (
          SELECT name
                ,count(*) AS ct_all
                ,count(NULLIF(operation, 'insert')) AS ct_del
          FROM   person
          GROUP  BY 1
          )   x
       WHERE (ct_all - ct_del) <> ct_del
       )   y
    JOIN   person USING (name)
    )   p
WHERE  rn > cancel

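It ends up similar to @Conrad's query, with just a few simplifications and improvements. The key is to eliminate names whose rows cancel out completely early in the game.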
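
The two row-number tests are two views of the same rule: numbering a group newest-first and keeping `rn <= surplus` selects the same rows as numbering it oldest-first and keeping `rn > cancel`, where `cancel` is `least(ct_del, ct_all - ct_del)`. The following minimal Python sketch checks that on plain lists of ids. It is an illustration of the arithmetic only, and all function names in it are hypothetical.

```python
def keep_newest_surplus(ids, surplus):
    """Number rows newest-first and keep those with rn <= surplus."""
    ranked = enumerate(sorted(ids, reverse=True), start=1)
    return sorted(row_id for rn, row_id in ranked if rn <= surplus)


def drop_oldest_cancelled(ids, cancel):
    """Number rows oldest-first and keep those with rn > cancel."""
    ranked = enumerate(sorted(ids), start=1)
    return sorted(row_id for rn, row_id in ranked if rn > cancel)


def cancel_count(insert_ids, delete_ids):
    """Compute least(ct_del, ct_all - ct_del) and compare both filters."""
    ct_del = len(delete_ids)
    ct_all = len(insert_ids) + ct_del
    cancel = min(ct_del, ct_all - ct_del)
    for ids in (insert_ids, delete_ids):
        surplus = len(ids) - cancel  # never negative, since cancel is the minimum
        assert keep_newest_surplus(ids, surplus) == drop_oldest_cancelled(ids, cancel)
    return cancel


# bob: inserts with ids 1 and 3, one delete with id 2
print(cancel_count([1, 3], [2]))          # 1
print(drop_oldest_cancelled([1, 3], 1))   # [3]
print(drop_oldest_cancelled([2], 1))      # []
```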