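I have a table that records which channels each user subscribes to (English and Hindi). I want only the users who subscribe to English and nothing else.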
+----+--------+--------------+
| id | userid | Subscribedto |
+----+--------+--------------+
| 1  | 1      | English      |
| 2  | 2      | English      |
| 3  | 1      | Hindi        |
| 4  | 3      | English      |
| 4  | 3      | Hindi        |
| 5  | 4      | English      |
+----+--------+--------------+
The result should be:
+----+--------+--------------+
| id | userid | Subscribedto |
+----+--------+--------------+
| 2  | 2      | English      |
| 5  | 4      | English      |
+----+--------+--------------+
Answer 0: (score: 1)
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(id SERIAL PRIMARY KEY
,user_id INT NOT NULL
,subscribed_to VARCHAR(12) NOT NULL
,UNIQUE KEY(user_id,subscribed_to)
);
INSERT INTO my_table VALUES
(1,1,'English'),
(2,2,'English'),
(3,1,'Hindi'),
(4,3,'English'),
(5,3,'Hindi'),
(6,4,'English');
-- Anti-join: keep English rows whose user has no row in any other language.
SELECT DISTINCT x.*
  FROM my_table x
  LEFT
  JOIN my_table y
    ON y.user_id = x.user_id
   AND y.subscribed_to <> x.subscribed_to
 WHERE x.subscribed_to = 'English'
   AND y.id IS NULL;
+----+---------+---------------+
| id | user_id | subscribed_to |
+----+---------+---------------+
|  2 |       2 | English       |
|  6 |       4 | English       |
+----+---------+---------------+
Answer 1: (score: 1)
Assuming the subscribedto column contains no NULLs, you can use NOT EXISTS:
select t.*
from tablename t
where not exists (
select 1 from tablename
where userid = t.userid and subscribedto <> 'English'
)
See the demo.
Result:
| id | userid | subscribedto |
| --- | ------ | ------------ |
| 2 | 2 | English |
| 5 | 4 | English |
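The "no NULLs" assumption matters because `subscribedto <> 'English'` evaluates to NULL, not true, when subscribedto is NULL, so such a row never trips the NOT EXISTS. Here is a minimal sketch of that effect using Python's built-in sqlite3 module rather than the asker's database. The three rows are hypothetical, and treating a NULL row as "some other subscription" in the second query is an assumption about the intended meaning, not something stated in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (id INTEGER, userid INTEGER, subscribedto TEXT)")
# Hypothetical rows: user 1 has one English row and one NULL row.
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?)",
    [(1, 1, "English"), (2, 1, None), (3, 2, "English")],
)

# The NOT EXISTS query from this answer (ORDER BY added for stable output).
plain = conn.execute("""
    SELECT t.* FROM tablename t
    WHERE NOT EXISTS (
        SELECT 1 FROM tablename
        WHERE userid = t.userid AND subscribedto <> 'English'
    )
    ORDER BY t.id
""").fetchall()

# Assumed variant: a NULL subscription also counts as "not English".
null_safe = conn.execute("""
    SELECT t.* FROM tablename t
    WHERE NOT EXISTS (
        SELECT 1 FROM tablename
        WHERE userid = t.userid
          AND (subscribedto <> 'English' OR subscribedto IS NULL)
    )
    ORDER BY t.id
""").fetchall()

print(plain)      # user 1's rows slip through, including the NULL row
print(null_safe)  # only user 2 remains
```

If the column is declared NOT NULL, as in the first answer's table definition, the two queries behave identically and the simpler one is enough.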
You can also add the condition subscribedto = 'English' to the WHERE clause:
select t.*
from tablename t
where subscribedto = 'English' and not exists (
select 1 from tablename
where userid = t.userid and subscribedto <> 'English'
)
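The result is the same, but this version may be more efficient, since rows that are not 'English' can be discarded before the subquery is evaluated for them.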
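As a quick sanity check of the "same result" claim, the sketch below runs both forms of the query against the sample rows from the question. It uses Python's built-in sqlite3 module as a stand-in for the asker's database, which is an assumption, and it only shows equivalence on this data set; it says nothing about relative speed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (id INTEGER, userid INTEGER, subscribedto TEXT)")
# Sample rows exactly as posted in the question (id 4 appears twice there).
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?)",
    [
        (1, 1, "English"), (2, 2, "English"), (3, 1, "Hindi"),
        (4, 3, "English"), (4, 3, "Hindi"), (5, 4, "English"),
    ],
)

# Shared subquery: does this user have any non-English subscription?
NOT_EXISTS = """NOT EXISTS (
    SELECT 1 FROM tablename
    WHERE userid = t.userid AND subscribedto <> 'English'
)"""

without_filter = conn.execute(
    f"SELECT t.* FROM tablename t WHERE {NOT_EXISTS} ORDER BY t.id"
).fetchall()

with_filter = conn.execute(
    f"SELECT t.* FROM tablename t "
    f"WHERE t.subscribedto = 'English' AND {NOT_EXISTS} ORDER BY t.id"
).fetchall()

print(without_filter)
print(with_filter)
print(without_filter == with_filter)
```

Both queries return the rows with id 2 and id 5 here, matching the result table above.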
Answer 2: (score: 1)
An analytic (window) function can do this in a single table scan, without an expensive join. This will run in Hive:
SELECT id, userid, subscribedto
  FROM
      (
       SELECT id, userid, subscribedto,
              max(case when subscribedto != 'English' then true else false end) over(partition by userid ) subscribed_not_english
         FROM my_table s
      )s
 WHERE NOT subscribed_not_english
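The same per-user flag idea can be tried outside Hive. The sketch below is an adaptation, not the Hive query itself: it assumes SQLite 3.25 or newer (for window functions) through Python's sqlite3 module, and it uses 1/0 in place of true/false because SQLite has no separate boolean type. The rows are the sample data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER, userid INTEGER, subscribedto TEXT)")
# Sample rows exactly as posted in the question (id 4 appears twice there).
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        (1, 1, "English"), (2, 2, "English"), (3, 1, "Hindi"),
        (4, 3, "English"), (4, 3, "Hindi"), (5, 4, "English"),
    ],
)

# For every row, compute whether its user has ANY non-English subscription:
# MAX over the user's partition is 1 if at least one row is not English.
rows = conn.execute("""
    SELECT id, userid, subscribedto
    FROM (
        SELECT id, userid, subscribedto,
               MAX(CASE WHEN subscribedto <> 'English' THEN 1 ELSE 0 END)
                   OVER (PARTITION BY userid) AS subscribed_not_english
        FROM my_table
    ) s
    WHERE subscribed_not_english = 0
    ORDER BY id
""").fetchall()

print(rows)
```

On this data the query keeps only users 2 and 4, the two users whose every row is English.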