How to query for values that share another value

Time: 2014-06-21 04:16:55

Tags: sql postgresql greenplum

The server runs Greenplum 4.2.2.4 (like PostgreSQL 8.2).

I have the following data:

id    | user
------+------
12345 | bob
12345 | jane
12345 | mary
44455 | user1
44455 | user2
44455 | user3
67890 | bob
53756 | bob
53756 | bob
53756 | bob
25246 | jane
54383 | jane
54383 | jane
54383 | jane

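I would like to return only the "id" values that are shared by more than one distinct "user". However, I will also be querying based on a list of "user" values that I am interested in. For example:

where user in ('mary', 'bob', 'user2')

I would expect the query to return: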

id    | user
------+------
12345 | bob
12345 | jane
12345 | mary
44455 | user1
44455 | user2
44455 | user3

How can I do this?

6 answers:

Answer 0: (score: 1)

You can do this with window functions:

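-- "user" is quoted because unquoted user is a reserved word in PostgreSQL
select id, "user"
from (select t.*, min("user") over (partition by id) as minuser,
             max("user") over (partition by id) as maxuser
      from table t  -- "table" stands in for your table name
     ) t
where minuser <> maxuser;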

EDIT: Without window functions (I thought they had been in Postgres since 8.1, but I trust Erwin on this), you can do the same thing with a join and group by:

select t.id, t."user"
from table t join
     (select id, min("user") as minuser, max("user") as maxuser
      from table t
      group by id  -- group by id, so min and max span all users of that id
      having min("user") <> max("user")
     ) tu
     on t.id = tu.id;
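
The join and group by idea can be tried against the sample data from the question. What follows is a minimal sketch, not the answerer's code: it runs on SQLite through Python's built-in sqlite3 module rather than on Greenplum, the table name users and column name username are assumptions chosen to avoid the reserved word user, and the SUM(CASE ...) condition is one assumed way to add the question's list of users of interest without window functions.

```python
import sqlite3

# Sample rows from the question. The column is called username here
# (an assumption) to sidestep the reserved word user.
ROWS = [
    (12345, "bob"), (12345, "jane"), (12345, "mary"),
    (44455, "user1"), (44455, "user2"), (44455, "user3"),
    (67890, "bob"),
    (53756, "bob"), (53756, "bob"), (53756, "bob"),
    (25246, "jane"),
    (54383, "jane"), (54383, "jane"), (54383, "jane"),
]

QUERY = """
SELECT t.id, t.username
FROM   users t
JOIN  (SELECT id
       FROM   users
       GROUP  BY id
       -- more than one distinct username for this id
       HAVING MIN(username) <> MAX(username)
       -- and at least one username from the list of interest
       AND    SUM(CASE WHEN username IN ('mary', 'bob', 'user2')
                       THEN 1 ELSE 0 END) > 0
      ) tu ON t.id = tu.id
ORDER  BY t.id, t.username
"""


def shared_ids_group_by():
    """Load the sample rows into an in-memory database and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER NOT NULL, username TEXT)")
    conn.executemany("INSERT INTO users (id, username) VALUES (?, ?)", ROWS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in shared_ids_group_by():
        print(row)
```

On the sample data this yields the six rows for ids 12345 and 44455 that the question asks for. Ids such as 53756 drop out because MIN and MAX of a single repeated name are equal.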

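Answer 1: (score: 0)

Try this solution:

In t1, duplicate rows (such as <53756, bob>) are collapsed into a single record.

Then, in the outer parentheses, the ids that belong to only one user are filtered out (such as <25246, jane>, or <53756, bob>, which by now has been collapsed into one record).

The records with the remaining ids are the answer: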

select *
    from OriginalTable
    where id in 
        (
        select id 
            from ( 
                select distinct id, "user"  -- quoted: user is a reserved word
                    from OriginalTable
                ) as t1
            group by id
            having count(*) > 1
        )
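
The two steps described above (collapse duplicate rows, then count per id) can be checked against the sample data from the question. This is a minimal sketch rather than part of the original answer: it uses Python's built-in sqlite3 module as a stand-in for Greenplum, and the table name users and column name username are assumptions chosen to avoid the reserved word user.

```python
import sqlite3

# Sample rows from the question, with the user column renamed to username.
ROWS = [
    (12345, "bob"), (12345, "jane"), (12345, "mary"),
    (44455, "user1"), (44455, "user2"), (44455, "user3"),
    (67890, "bob"),
    (53756, "bob"), (53756, "bob"), (53756, "bob"),
    (25246, "jane"),
    (54383, "jane"), (54383, "jane"), (54383, "jane"),
]

# Step 1 and 2 together: distinct (id, username) pairs, counted per id.
COUNTS = """
SELECT id, COUNT(*)
FROM  (SELECT DISTINCT id, username FROM users) AS t1
GROUP  BY id
ORDER  BY id
"""

# Same query, keeping only ids that have more than one distinct username.
SHARED = """
SELECT id
FROM  (SELECT DISTINCT id, username FROM users) AS t1
GROUP  BY id
HAVING COUNT(*) > 1
ORDER  BY id
"""


def run(sql):
    """Load the sample rows into an in-memory database and run one query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER NOT NULL, username TEXT)")
    conn.executemany("INSERT INTO users (id, username) VALUES (?, ?)", ROWS)
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(run(COUNTS))  # distinct-user count for every id
    print(run(SHARED))  # ids shared by several distinct users
```

Note that this approach, as written in the answer, does not apply the question's list of users of interest. On the sample data that makes no difference, because the only ids with several users already contain one of the listed names.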

Answer 2: (score: 0)

Try this query. I tested it on a PostgreSQL table with 3.5 million rows, and it took about 1.7 seconds.

select id,
       uname 
from   (
    select 
           id,
           uname,
           count(*) over (partition by id,uname) as count_of_unique_id_share,
           count(*) over (partition by id) as count_of_id_share 
    from 
           (select * from (select distinct id,uname from <TABLE>) z 
        where  id in (select id from <TABLE> where uname in ('mary','bob','user2')))y ) x 
where 
        count_of_unique_id_share = 1 and count_of_id_share > 1

Answer 3: (score: 0)

select id, "user"
from
    (
        select id
        from t
        group by id
        having
            count(distinct "user") > 1
            and
            array['mary','bob','user2']::varchar(5)[] && array_agg("user")
    ) s
    inner join
    t using (id)
order by id, "user"

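Answer 4: (score: 0)

Postgres 8.2 does not have window functions (they were introduced in version 8.4). Since you are looking for rows where

the "id" is shared by more than one unique "user" value,

a query with EXISTS works: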

SELECT t2.id, t2.user
FROM   tbl t1
JOIN   tbl t2 USING (id)    -- retrieve all rows with same id
WHERE  t1.user IN ('mary','bob','user2')
AND    EXISTS (
   SELECT 1
   FROM   tbl
   WHERE  id = t1.id
   AND    user <> t1.user   -- at least one other user with same id
   )
ORDER  BY t2.id, t2.user;

The names are symbolic. One would not use the reserved word user as an identifier.

This variant may be faster:

SELECT id, user
FROM (
    SELECT id
    FROM   tbl t1
    WHERE  user IN ('mary','bob','user2')
    AND    EXISTS (
        SELECT 1
        FROM   tbl
        WHERE  id = t1.id
        AND    user <> t1.user
        )
    ) sub
JOIN   tbl USING (id)
ORDER  BY id, user;

As you requested, either query returns all rows, including complete duplicates. If you only want distinct rows:

SELECT DISTINCT id, user ...
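
To see what the duplicates look like in practice, the second variant can be run with and without DISTINCT on the sample data from the question. This is a rough sketch under stated assumptions: SQLite through Python's built-in sqlite3 module stands in for Greenplum, and the table name users and column name username are hypothetical replacements for the symbolic names tbl and user.

```python
import sqlite3

# Sample rows from the question, with the user column renamed to username.
ROWS = [
    (12345, "bob"), (12345, "jane"), (12345, "mary"),
    (44455, "user1"), (44455, "user2"), (44455, "user3"),
    (67890, "bob"),
    (53756, "bob"), (53756, "bob"), (53756, "bob"),
    (25246, "jane"),
    (54383, "jane"), (54383, "jane"), (54383, "jane"),
]

# The "variant" query; {distinct} is filled with "" or "DISTINCT".
VARIANT = """
SELECT {distinct} id, username
FROM  (
    SELECT id
    FROM   users u1
    WHERE  username IN ('mary', 'bob', 'user2')
    AND    EXISTS (
        SELECT 1
        FROM   users
        WHERE  id = u1.id
        AND    username <> u1.username  -- another user with the same id
        )
    ) sub
JOIN   users USING (id)
ORDER  BY id, username
"""


def run_variant(distinct):
    """Run the variant query, optionally with DISTINCT, on the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER NOT NULL, username TEXT)")
    conn.executemany("INSERT INTO users (id, username) VALUES (?, ?)", ROWS)
    sql = VARIANT.format(distinct="DISTINCT" if distinct else "")
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(len(run_variant(distinct=False)))  # rows including duplicates
    print(len(run_variant(distinct=True)))   # distinct rows only
```

On this data the plain query returns each row for id 12345 twice, because both bob and mary match the list and each contributes the id to the subquery once. DISTINCT reduces the result to the six rows shown in the question.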

Answer 5: (score: 0)

CREATE TABLE users( id INTEGER NOT NULL
        , username varchar
        );

INSERT INTO users (id, username) VALUES
  (12345 , 'bob' )
, (12345 , 'jane' )
, (12345 , 'mary' )
, (44455 , 'user1' )
, (44455 , 'user2' )
, (44455 , 'user3' )
, (67890 , 'bob' )
, (53756 , 'bob' )
, (53756 , 'bob' )
, (53756 , 'bob' )
, (25246 , 'jane' )
, (54383 , 'jane' )
, (54383 , 'jane' )
, (54383 , 'jane' )
        ;

SELECT *
FROM users u1
WHERE EXISTS (
        SELECT *
        FROM users u2
        -- id must at least have one of these three usernames
        WHERE u2.username IN ('mary','bob','user2')
        AND u2.id = u1.id
        AND EXISTS (
                SELECT *
                FROM users u3
                WHERE u3.id = u2.id
                -- and there must exist a different username for this id
                AND u3.username <> u2.username
                )
        );

Result:

CREATE TABLE
INSERT 0 14
  id   | username 
-------+----------
 12345 | bob
 12345 | jane
 12345 | mary
 44455 | user1
 44455 | user2
 44455 | user3
(6 rows)