I am trying to write a SQL query that returns rows from a table containing data.
The table structure is as follows:
CREATE TABLE person(
id INT PRIMARY KEY,
name TEXT,
operation TEXT);
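I want to return all unique-name rows that have not been "cancelled". A row is considered "cancelled" if its operation is either "insert" or "delete" and there is another row with the same name and the opposite operation.
For example, if I have the following rows: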
id name operation
1 bob insert
2 bob delete
3 bob insert
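The first two rows "cancel" each other out because they share the same name with opposite operations. So the query should return row 3.
Here is another example: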
id name operation
1 bob insert
2 bob delete
3 bob insert
4 bob delete
In this case rows 1 and 2 cancel out, and rows 3 and 4 cancel out. So the query should not return any rows.
One last example:
id name operation
1 bob insert
2 bob insert
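In this case rows 1 and 2 do not cancel out because the operations are not opposite. So the query should return both rows.
I have the following query, which handles the first two scenarios but not the final one.
Does anyone have a suggestion for a query that handles all 3 scenarios?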
SELECT MAX(id),name
FROM person z
WHERE operation IN ('insert','delete')
GROUP BY name
HAVING count(1) % 2 = 1;
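To make the failure concrete, here is a minimal sketch that runs the query above against the three scenarios. It assumes Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the helper name run_parity_query is made up for illustration.

```python
import sqlite3

# The query from the question, unchanged.
PARITY_QUERY = """
SELECT MAX(id), name
FROM person z
WHERE operation IN ('insert','delete')
GROUP BY name
HAVING count(1) % 2 = 1
"""

def run_parity_query(rows):
    """Load rows into a fresh in-memory table and run the query (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person(id INT PRIMARY KEY, name TEXT, operation TEXT)")
    conn.executemany("INSERT INTO person VALUES (?, ?, ?)", rows)
    result = conn.execute(PARITY_QUERY).fetchall()
    conn.close()
    return result

scenario_1 = [(1, "bob", "insert"), (2, "bob", "delete"), (3, "bob", "insert")]
scenario_2 = scenario_1 + [(4, "bob", "delete")]
scenario_3 = [(1, "bob", "insert"), (2, "bob", "insert")]

result_1 = run_parity_query(scenario_1)  # odd row count: the newest id is returned
result_2 = run_parity_query(scenario_2)  # even row count: nothing is returned
result_3 = run_parity_query(scenario_3)  # even row count again: nothing is returned
```

The third call comes back empty because two inserts give an even count, although rows 1 and 2 should both be returned.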
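Answer 0: (score: 4)
One way is to compare the operation counts. Because you also need to fetch the number of INSERTS or DELETES corresponding to InsertCount - deleteCount or deleteCount - InsertCount, and because PostgreSQL supports window functions, you should be able to use row_number().
Note: I have not tested this, but according to the PostgreSQL manual, Chapter 3. Advanced Features, 3.5 Window Functions, you can reference a window function in an inline query.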
SELECT
id, name
FROM
(
SELECT
row_number() over (partition by p.name, p.operation order by p.id desc) rn ,
id,
p.Name,
p.operation,
operationCounts.InsertCount,
operationCounts.deleteCount
FROM
Person p
INNER JOIN (
SELECT
SUM(CASE WHEN operation = 'insert' then 1 else 0 END) InsertCount,
SUM(CASE WHEN operation = 'delete' then 1 else 0 END) deleteCount,
name
FROM
person
GROUP BY
name ) operationCounts
ON p.name = operationCounts.name
WHERE
operationCounts.InsertCount <> operationCounts.deleteCount) data
WHERE
(rn <= (InsertCount - deleteCount)
and operation = 'insert')
OR
(rn <= (deleteCount - InsertCount)
and operation = 'delete')
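As a rough illustration of how the row numbers and the counts interact, the sketch below computes both for the first example from the question and then applies the same filter as the outer WHERE clause. It assumes Python's built-in sqlite3 module standing in for PostgreSQL, and the alias c and the variable names are only illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person(id INT PRIMARY KEY, name TEXT, operation TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [(1, "bob", "insert"), (2, "bob", "delete"), (3, "bob", "insert")],
)

# Inner part of the approach: a row number per (name, operation), newest id first,
# joined to the per-name insert and delete counts.
rows = conn.execute("""
SELECT p.id, p.operation,
       row_number() OVER (PARTITION BY p.name, p.operation ORDER BY p.id DESC) AS rn,
       c.InsertCount, c.deleteCount
FROM person p
INNER JOIN (
    SELECT name,
           SUM(CASE WHEN operation = 'insert' THEN 1 ELSE 0 END) AS InsertCount,
           SUM(CASE WHEN operation = 'delete' THEN 1 ELSE 0 END) AS deleteCount
    FROM person
    GROUP BY name
) c ON c.name = p.name
ORDER BY p.id
""").fetchall()
conn.close()

# Same condition as the outer WHERE: keep only the surplus rows of the larger side.
survivors = [
    row_id
    for row_id, operation, rn, insert_count, delete_count in rows
    if (operation == "insert" and rn <= insert_count - delete_count)
    or (operation == "delete" and rn <= delete_count - insert_count)
]
```

Because the row numbers are assigned by descending id, the surplus that survives is the most recent rows of the more frequent operation, which is row 3 here.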
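Answer 1: (score: 1)
For the best speed and the shortest answer, the problem can be simplified.
It can be written in a single pass this way (I do not know whether everything in this query works):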
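select * from (
SELECT id, name, operation,
row_number() over (partition by name order by case
when operation = 'insert'
then id
else null end
nulls last ) rnk_insert,
count(case
when operation='delete' then 1
else null
end) over (partition by name) as cnt_del
FROM person z
WHERE operation IN ('insert','delete')
) t -- alias for the derived table (required by older PostgreSQL versions)
-- delete rows sort last and would otherwise pass the rank test, so keep inserts only
where operation = 'insert'
and rnk_insert > cnt_del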
If the previous query cannot be used in Postgres (AFAIK, Oracle can handle it), the solution can be written in this lighter way:
select i.id, i.name
from
(select id, name,
-- row_number is a function call and needs parentheses
row_number() over (partition by name order by id) as rnk_insert
from person z
where operation='insert') i
left join
(select name, count(*) as cnt_del
from person z
where operation='delete'
-- one delete count per name
group by name) d
on d.name = i.name
where rnk_insert > coalesce(cnt_del, 0)
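A small sketch of the join variant, with row_number() written as a call and the delete counts grouped per name, shows why the coalesce matters: a name with no deletes gets NULL from the left join. It assumes Python's built-in sqlite3 module in place of PostgreSQL, and the "alice" rows are made-up extra data added to the question's third example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person(id INT PRIMARY KEY, name TEXT, operation TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [
        (1, "bob", "insert"),    # third example from the question
        (2, "bob", "insert"),
        (3, "alice", "insert"),  # hypothetical pair that cancels out
        (4, "alice", "delete"),
    ],
)

uncancelled_inserts = conn.execute("""
SELECT i.id, i.name
FROM (SELECT id, name,
             row_number() OVER (PARTITION BY name ORDER BY id) AS rnk_insert
      FROM person
      WHERE operation = 'insert') i
LEFT JOIN (SELECT name, count(*) AS cnt_del
           FROM person
           WHERE operation = 'delete'
           GROUP BY name) d
       ON d.name = i.name
-- bob has no delete rows, so cnt_del is NULL and coalesce turns it into 0
WHERE i.rnk_insert > coalesce(d.cnt_del, 0)
ORDER BY i.id
""").fetchall()
conn.close()
```

Both bob rows come back and alice's insert is dropped. Note that this variant only ever reports surplus inserts; a name with more deletes than inserts yields no rows.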
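Answer 2: (score: 0)
Testing showed my original query to be slower than @Conrad's excellent query. Humbled, I tried a few things and came up with a query that is actually simpler and faster.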
INSERT INTO person
SELECT i
,'name' || (random() * 500)::int::text
,CASE WHEN random() >= 0.5 THEN 'insert' ELSE 'delete' END
FROM generate_series(1,10000) AS i;
SELECT id, name, operation
FROM (
SELECT row_number() OVER (PARTITION BY name, operation ORDER by id) AS rn
,id
,name
,operation
,y.cancel
FROM (
SELECT name
,least(ct_del, ct_all - ct_del) AS cancel
FROM (
SELECT name
,count(*) AS ct_all
,count(NULLIF(operation, 'insert')) AS ct_del
FROM person
GROUP BY 1
) x
WHERE (ct_all - ct_del) <> ct_del
) y
JOIN person USING (name)
) p
WHERE rn > cancel
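It ends up similar to @Conrad's query, with just a few simplifications and improvements. The key is to eliminate names that cancel out completely early in the game.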
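As a quick sanity check, the sketch below runs a close variant of this query against the three scenarios from the question. It assumes Python's built-in sqlite3 module rather than PostgreSQL, so least() is replaced by SQLite's two-argument scalar min(), and the helper name surviving_ids is made up for illustration.

```python
import sqlite3

# Same shape as the query above; min(a, b) stands in for PostgreSQL's least(a, b).
CANCEL_QUERY = """
SELECT id
FROM (
    SELECT row_number() OVER (PARTITION BY person.name, person.operation
                              ORDER BY person.id) AS rn
          ,person.id AS id
          ,y.cancel AS cancel
    FROM (
        SELECT name
              ,min(ct_del, ct_all - ct_del) AS cancel
        FROM (
            SELECT name
                  ,count(*) AS ct_all
                  ,count(NULLIF(operation, 'insert')) AS ct_del
            FROM person
            GROUP BY 1
        ) x
        WHERE (ct_all - ct_del) <> ct_del  -- drop fully cancelled names early
    ) y
    JOIN person USING (name)
) p
WHERE rn > cancel
ORDER BY id
"""

def surviving_ids(rows):
    """Return the ids left after cancellation (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person(id INT PRIMARY KEY, name TEXT, operation TEXT)")
    conn.executemany("INSERT INTO person VALUES (?, ?, ?)", rows)
    ids = [row[0] for row in conn.execute(CANCEL_QUERY)]
    conn.close()
    return ids

ids_1 = surviving_ids([(1, "bob", "insert"), (2, "bob", "delete"), (3, "bob", "insert")])
ids_2 = surviving_ids([(1, "bob", "insert"), (2, "bob", "delete"),
                       (3, "bob", "insert"), (4, "bob", "delete")])
ids_3 = surviving_ids([(1, "bob", "insert"), (2, "bob", "insert")])
```

The three calls give row 3, no rows, and rows 1 and 2, matching what the question asks for. Since ct_del counts every operation that is not 'insert', the sketch assumes the table holds only 'insert' and 'delete' rows.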