The server runs Greenplum 4.2.2.4 (which behaves like PostgreSQL 8.2).
I have the following data:
id | user
------+------
12345 | bob
12345 | jane
12345 | mary
44455 | user1
44455 | user2
44455 | user3
67890 | bob
53756 | bob
53756 | bob
53756 | bob
25246 | jane
54383 | jane
54383 | jane
54383 | jane
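I only want to return the rows whose "id" value is shared by more than one distinct "user". However, I will also be querying based on a list of "user" values that I am interested in. For example:
where user in ('mary', 'bob', 'user2')
I would like the query to return: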
id | user
------+------
12345 | bob
12345 | jane
12345 | mary
44455 | user1
44455 | user2
44455 | user3
How can I do this?
Answer 0 (score: 1)
You can do this with window functions:
select id, "user"
from (select t.*,
             min("user") over (partition by id) as minuser,
             max("user") over (partition by id) as maxuser
      from tbl t   -- tbl stands in for the real table name
     ) t
where minuser <> maxuser;   -- the id has at least two different users
EDIT: Without window functions (I thought they had been in Postgres since 8.1, but I trust Erwin on this), you can do the same thing with join and group by:
select t.id, t."user"
from tbl t join
     (select id, min("user") as minuser, max("user") as maxuser
      from tbl
      group by id                          -- one row per id, not per user
      having min("user") <> max("user")    -- the id has at least two different users
     ) tu
     on t.id = tu.id;
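The join and group by approach above does not yet apply the list of users from the question. One possible way to add that condition, without window functions, is a conditional sum in the having clause. This is only a sketch: the table name tbl and the quoted column "user" are assumptions, not names taken from the original answer.

```sql
-- Assumes a table tbl(id, "user") holding the question's sample data.
select t.id, t."user"
from tbl t
join (select id
      from tbl
      group by id
      having min("user") <> max("user")              -- more than one distinct user
         and sum(case when "user" in ('mary', 'bob', 'user2')
                      then 1 else 0 end) > 0         -- at least one user of interest
     ) tu
  on t.id = tu.id
order by t.id, t."user";
```

On the sample data this keeps only ids 12345 and 44455, since every other id has a single distinct user.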
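Answer 1 (score: 0)
Try this solution:
In t1, duplicate non-unique rows (such as <53756, bob>) are collapsed into a single record.
Then, in the outer subquery, the ids that belong to only one user are filtered out (such as <25246, jane>, or <53756, bob>, which is now a single record).
The records that have the remaining ids are the answer: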
select *
from OriginalTable
where id in
(
    select id
    from (
        select distinct id, "user"   -- collapse exact duplicate rows
        from OriginalTable
    ) as t1
    group by id
    having count(*) > 1              -- keep ids with more than one distinct user
)
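To see what the deduplication and grouping steps described above work with, the number of distinct users per id can be inspected directly. A minimal sketch, assuming the question's sample data is loaded into OriginalTable and the column is referenced with the quoted name "user":

```sql
-- Number of distinct users per id; ids with a count above 1 are the shared ones.
select id, count(distinct "user") as distinct_users
from OriginalTable
group by id
order by id;
```

With the sample data, only 12345 and 44455 have a count above 1 (three users each); every other id has a count of 1.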
Answer 2 (score: 0)
Try this query. I tested it on a PostgreSQL table with 3.5 million rows and it took about 1.7 seconds.
select id,
uname
from (
select
id,
uname,
count(*) over (partition by id,uname) as count_of_unique_id_share,
count(*) over (partition by id) as count_of_id_share
from
(select * from (select distinct id,uname from <TABLE>) z
where id in (select id from <TABLE> where uname in ('mary','bob','user2')))y ) x
where
count_of_unique_id_share = 1 and count_of_id_share > 1
Answer 3 (score: 0)
select id, "user"
from
(
select id
from t
group by id
having
count(distinct "user") > 1
and
array['mary','bob','user2']::varchar(5)[] && array_agg("user")
) s
inner join
t using (id)
order by id, "user"
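Answer 4 (score: 0)
Postgres 8.2 does not have window functions (they were introduced in version 8.4). Since you are looking for rows where
"id" is shared by more than one unique "user" value,
you can use EXISTS instead: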
SELECT t2.id, t2.user
FROM tbl t1
JOIN tbl t2 USING (id) -- retrieve all rows with same id
WHERE t1.user IN ('mary','bob','user2')
AND EXISTS (
SELECT 1
FROM tbl
WHERE id = t1.id
AND user <> t1.user -- at least one other user with same id
)
ORDER BY t2.id, t2.user;
The names are symbolic. One would not use a reserved word such as user as an identifier.
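To illustrate the point about the reserved word: in PostgreSQL an unquoted user is a keyword equivalent to current_user, so it does not refer to a column. A short sketch, assuming a real table named tbl whose column is literally called "user":

```sql
-- Unquoted: user is the keyword, not a column.
select user;                 -- returns the name of the current database role

-- Double-quoted: "user" is an ordinary identifier, here a column of tbl.
select id, "user"
from tbl
where "user" in ('mary', 'bob', 'user2');
```

Renaming the column (for example to username) avoids the quoting altogether.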
This variant may be faster:
SELECT id, user
FROM (
SELECT id
FROM tbl t1
WHERE user IN ('mary','bob','user2')
AND EXISTS (
SELECT 1
FROM tbl
WHERE id = t1.id
AND user <> t1.user
)
) sub
JOIN tbl USING (id)
ORDER BY id, user;
As you requested, either query returns all rows, including complete duplicates. If you only want distinct rows:
SELECT DISTINCT id, user ...
Answer 5 (score: 0)
CREATE TABLE users( id INTEGER NOT NULL
, username varchar
);
INSERT INTO users (id, username) VALUES
(12345 , 'bob' )
, (12345 , 'jane' )
, (12345 , 'mary' )
, (44455 , 'user1' )
, (44455 , 'user2' )
, (44455 , 'user3' )
, (67890 , 'bob' )
, (53756 , 'bob' )
, (53756 , 'bob' )
, (53756 , 'bob' )
, (25246 , 'jane' )
, (54383 , 'jane' )
, (54383 , 'jane' )
, (54383 , 'jane' )
;
SELECT *
FROM users u1
WHERE EXISTS (
SELECT *
FROM users u2
-- id must at least have one of these three usernames
WHERE u2.username IN ('mary','bob','user2')
AND u2.id = u1.id
AND EXISTS (
SELECT *
FROM users u3
WHERE u3.id = u2.id
-- and there must exist a different username for this id
AND u3.username <> u2.username
)
);
Result:
CREATE TABLE
INSERT 0 14
id | username
-------+----------
12345 | bob
12345 | jane
12345 | mary
44455 | user1
44455 | user2
44455 | user3
(6 rows)