I have a simple table that holds the comments each user has left on a post:
id | user | post_id | comment
----------------------------------------------------------
0 | john@test.com | 1001 | great article
1 | bob@test.com | 1001 | nice post
2 | john@test.com | 1002 | I agree
3 | john@test.com | 1001 | thats cool
4 | bob@test.com | 1002 | thanks for sharing
5 | bob@test.com | 1002 | really helpful
6 | steve@test.com | 1001 | spam post about pills
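I want to get every instance where a user has commented more than once on the same post (meaning the same user and the same post_id). For the data above, I would expect to get back: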
id | user | post_id | comment
----------------------------------------------------------
0 | john@test.com | 1001 | great article
3 | john@test.com | 1001 | thats cool
4 | bob@test.com | 1002 | thanks for sharing
5 | bob@test.com | 1002 | really helpful
I thought DISTINCT was what I needed, but that only gives me the unique rows.
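Answer 0 (score: 2)
You can use GROUP BY and HAVING to find the (user, post_id) pairs that have more than one entry, then join those pairs back to the table to retrieve the full rows: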
SELECT a.*
FROM table_name a
JOIN (SELECT user, post_id
FROM table_name
GROUP BY user, post_id
HAVING COUNT(id) > 1
) b
ON a.user = b.user
AND a.post_id = b.post_id
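To check the GROUP BY / HAVING approach against the sample data from the question, here is a minimal sketch that runs the same query in an in-memory SQLite database through Python's sqlite3 module. The table name comments and the helper repeated_comments are assumptions made for illustration, and SQLite stands in for whatever engine you are actually using, so dialect details may differ.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    (0, "john@test.com", 1001, "great article"),
    (1, "bob@test.com", 1001, "nice post"),
    (2, "john@test.com", 1002, "I agree"),
    (3, "john@test.com", 1001, "thats cool"),
    (4, "bob@test.com", 1002, "thanks for sharing"),
    (5, "bob@test.com", 1002, "really helpful"),
    (6, "steve@test.com", 1001, "spam post about pills"),
]

# Same shape as the query above; "comments" is a hypothetical table name.
QUERY = """
SELECT a.*
FROM comments a
JOIN (SELECT user, post_id
      FROM comments
      GROUP BY user, post_id
      HAVING COUNT(id) > 1) b
  ON a.user = b.user
 AND a.post_id = b.post_id
ORDER BY a.id
"""


def repeated_comments(rows):
    """Return the full rows whose (user, post_id) pair occurs more than once."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE comments (id INTEGER, "user" TEXT, post_id INTEGER, comment TEXT)'
    )
    conn.executemany("INSERT INTO comments VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in repeated_comments(ROWS):
        print(row)
```

On the sample data this should return the rows with ids 0, 3, 4 and 5, which matches the expected result in the question.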
Answer 1 (score: 0)
DISTINCT removes all duplicate rows, which is why you only get unique rows back.
You could try a CROSS JOIN (available as of Hive 0.10, according to https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Joins):
SELECT mt.*
FROM MYTABLE mt
CROSS JOIN MYTABLE mt2
WHERE mt.user = mt2.user
AND mt.post_id = mt2.post_id
AND mt.id <> mt2.id -- skip each row's match with itself, otherwise every row is returned
Performance may not be the best, though. If you want the result sorted, use SORT BY or ORDER BY.
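One thing to watch with a self-join on user and post_id is that every row also matches itself. The sketch below is only an illustration, run on SQLite through Python's sqlite3 module rather than on Hive; the table name comments and the helper self_join_ids are assumptions. It compares the join with and without an id <> id condition on the sample data.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    (0, "john@test.com", 1001, "great article"),
    (1, "bob@test.com", 1001, "nice post"),
    (2, "john@test.com", 1002, "I agree"),
    (3, "john@test.com", 1001, "thats cool"),
    (4, "bob@test.com", 1002, "thanks for sharing"),
    (5, "bob@test.com", 1002, "really helpful"),
    (6, "steve@test.com", 1001, "spam post about pills"),
]


def self_join_ids(rows, exclude_self):
    """Return the ids produced by the self-join, optionally skipping self-matches."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE comments (id INTEGER, "user" TEXT, post_id INTEGER, comment TEXT)'
    )
    conn.executemany("INSERT INTO comments VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT mt.id
        FROM comments mt
        CROSS JOIN comments mt2
        WHERE mt.user = mt2.user
          AND mt.post_id = mt2.post_id
    """
    if exclude_self:
        # A row always shares user and post_id with itself, so filter that pair out.
        query += " AND mt.id <> mt2.id"
    query += " ORDER BY mt.id"
    ids = [r[0] for r in conn.execute(query)]
    conn.close()
    return ids


if __name__ == "__main__":
    print(self_join_ids(ROWS, exclude_self=False))
    print(self_join_ids(ROWS, exclude_self=True))
```

Without the extra condition all seven ids come back, and the repeated ones appear twice. With it, only ids 0, 3, 4 and 5 remain. Note that a user with three or more comments on the same post would still have each row repeated, so some form of de-duplication (or the GROUP BY approach) would be needed in that case.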
Answer 2 (score: 0)
-- Sample data from the question, loaded into a T-SQL table variable.
-- The column is named usr because USER is a reserved keyword in T-SQL.
DECLARE @MyTable TABLE (id int, usr varchar(50), post_id int, comment varchar(50))
INSERT @MyTable (id, usr, post_id, comment) VALUES (0,'john@test.com',1001,'great article')
INSERT @MyTable (id, usr, post_id, comment) VALUES (1,'bob@test.com',1001,'nice post')
INSERT @MyTable (id, usr, post_id, comment) VALUES (2,'john@test.com',1002,'I agree')
INSERT @MyTable (id, usr, post_id, comment) VALUES (3,'john@test.com',1001,'thats cool')
INSERT @MyTable (id, usr, post_id, comment) VALUES (4,'bob@test.com',1002,'thanks for sharing')
INSERT @MyTable (id, usr, post_id, comment) VALUES (5,'bob@test.com',1002,'really helpful')
INSERT @MyTable (id, usr, post_id, comment) VALUES (6,'steve@test.com',1001,'spam post about pills')
-- Self-join on (usr, post_id) and keep the rows that match more than one row.
SELECT
T1.id,
T1.usr,
T1.post_id,
T1.comment
FROM
@MyTable T1
INNER JOIN @MyTable T2
ON T1.usr = T2.usr AND T1.post_id = T2.post_id
GROUP BY
T1.id,
T1.usr,
T1.post_id,
T1.comment
HAVING
Count(T2.id) > 1