Self-join to find duplicates but include all columns

Time: 2017-05-05 17:18:58

Tags: sql join self-join

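I want to match all entries in a log table that share the same day and cause and appear more than once in the table. I have already written a query that extracts the duplicated (day, cause) pairs; my problem is that I need access to all columns of the table in the result so I can use them in a later JOIN. The table looks like this: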

| ID | DATE       | CAUSE | USER | ... |
|--------------------------------------|
| x  | 2017-01-01 | aaa   | 100  | ... |
| x  | 2017-01-02 | aaa   | 101  | ... |
| x  | 2017-01-03 | bbb   | 101  | ... |
| x  | 2017-01-03 | bbb   | 101  | ... |
| x  | 2017-01-04 | ccc   | 101  | ... |
| x  | 2017-01-04 | ccc   | 101  | ... |
| x  | 2017-01-04 | ccc   | 101  | ... |
| x  | 2017-01-05 | aaa   | 101  | ... |
| .....................................|
| .....................................|
| .....................................|

Query:

SELECT logs.* FROM 
    (SELECT day, cause FROM logs 
         GROUP BY day, cause HAVING COUNT(*) > 1) AS logsTwice, logs 
WHERE logsTwice.day = logs.day AND logsTwice.cause = logs.cause

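The subselect fetches exactly the right data (day and cause), but when I try to get the additional columns for those matches, I get completely wrong data. What am I doing wrong?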

3 answers:

Answer 0: (score: 0)

Try this:

SELECT logs.* FROM logs
inner join 
(SELECT day, cause FROM logs GROUP BY day, cause HAVING COUNT(*) > 1) logsTwice
on logsTwice.day = logs.day AND logsTwice.cause = logs.cause
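As a rough check of the join above, the following sketch loads the sample rows from the question into an in-memory SQLite database and runs the same query. Several things here are assumptions made for illustration only: the `id` values 1 to 8 (the question shows just `x`), the column name `day` (taken from the queries, although the table header reads DATE), and the use of SQLite via Python's `sqlite3` module.

```python
import sqlite3

# In-memory database; the schema is an assumption based on the question's table.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE logs (id INTEGER PRIMARY KEY, day TEXT, cause TEXT, "user" INTEGER)'
)

# Sample rows from the question; the id values are hypothetical.
sample_rows = [
    (1, "2017-01-01", "aaa", 100),
    (2, "2017-01-02", "aaa", 101),
    (3, "2017-01-03", "bbb", 101),
    (4, "2017-01-03", "bbb", 101),
    (5, "2017-01-04", "ccc", 101),
    (6, "2017-01-04", "ccc", 101),
    (7, "2017-01-04", "ccc", 101),
    (8, "2017-01-05", "aaa", 101),
]
conn.executemany("INSERT INTO logs VALUES (?, ?, ?, ?)", sample_rows)

# Join every row to the (day, cause) pairs that occur more than once.
join_query = """
SELECT logs.*
FROM logs
INNER JOIN (
    SELECT day, cause
    FROM logs
    GROUP BY day, cause
    HAVING COUNT(*) > 1
) AS logsTwice
  ON logsTwice.day = logs.day AND logsTwice.cause = logs.cause
ORDER BY logs.id
"""
duplicates = conn.execute(join_query).fetchall()
for row in duplicates:
    print(row)
```

Because the subquery yields one row per (day, cause) pair, each row of `logs` matches at most once, so the result should contain every column of the repeated rows without multiplying them.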

Answer 1: (score: 0)

You can use window functions:

SELECT l.*
FROM (SELECT l.*,
             COUNT(*) OVER (PARTITION BY day, cause) as cnt
      FROM logs l
     ) l
WHERE cnt > 1;

In general, window functions perform better than the equivalent query using JOIN and GROUP BY.
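A minimal sketch of the window-function approach on the question's sample data, again using an in-memory SQLite database through Python's `sqlite3` module. The assumptions are the same as in the question's setup rather than anything the answer states: hypothetical `id` values 1 to 8, a column named `day`, and an SQLite build of version 3.25 or later (earlier versions do not support window functions).

```python
import sqlite3

# In-memory database; the schema is an assumption based on the question's table.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE logs (id INTEGER PRIMARY KEY, day TEXT, cause TEXT, "user" INTEGER)'
)

# Sample rows from the question; the id values are hypothetical.
conn.executemany(
    "INSERT INTO logs VALUES (?, ?, ?, ?)",
    [
        (1, "2017-01-01", "aaa", 100),
        (2, "2017-01-02", "aaa", 101),
        (3, "2017-01-03", "bbb", 101),
        (4, "2017-01-03", "bbb", 101),
        (5, "2017-01-04", "ccc", 101),
        (6, "2017-01-04", "ccc", 101),
        (7, "2017-01-04", "ccc", 101),
        (8, "2017-01-05", "aaa", 101),
    ],
)

# Attach the size of each (day, cause) group to every row, then keep
# only the rows whose group has more than one member.
window_query = """
SELECT id, day, cause, cnt
FROM (
    SELECT l.*, COUNT(*) OVER (PARTITION BY day, cause) AS cnt
    FROM logs AS l
) AS counted
WHERE cnt > 1
ORDER BY id
"""
counted_rows = conn.execute(window_query).fetchall()
for row in counted_rows:
    print(row)
```

The table is read once and each row carries its group count, so no second pass or join back to `logs` is needed. Whether that is actually faster depends on the database and its indexes.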

Answer 2: (score: 0)

You can try:

SELECT l1.*
  FROM logs l1
 INNER JOIN logs l2
    ON (l1.id <> l2.id
        AND l1.day = l2.day
        AND l1.cause = l2.cause
        AND l1.user <> l2.user);
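Two properties of this self-join are worth checking against the question's sample data, where the repeated rows share the same user. The sketch below runs the query as written and then a variant; the variant is not from the answer. It assumes that `id` is unique per row (the question shows only `x`, so the values 1 to 8 are hypothetical), drops the `user` comparison, and adds `DISTINCT`, because a row with several matching partners is otherwise returned once per partner.

```python
import sqlite3

# In-memory database; the schema is an assumption based on the question's table.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE logs (id INTEGER PRIMARY KEY, day TEXT, cause TEXT, "user" INTEGER)'
)

# Sample rows from the question; the id values are hypothetical.
conn.executemany(
    "INSERT INTO logs VALUES (?, ?, ?, ?)",
    [
        (1, "2017-01-01", "aaa", 100),
        (2, "2017-01-02", "aaa", 101),
        (3, "2017-01-03", "bbb", 101),
        (4, "2017-01-03", "bbb", 101),
        (5, "2017-01-04", "ccc", 101),
        (6, "2017-01-04", "ccc", 101),
        (7, "2017-01-04", "ccc", 101),
        (8, "2017-01-05", "aaa", 101),
    ],
)

# The self-join as written in the answer: it also requires different users.
as_written = """
SELECT l1.*
FROM logs AS l1
INNER JOIN logs AS l2
   ON l1.id <> l2.id
  AND l1.day = l2.day
  AND l1.cause = l2.cause
  AND l1."user" <> l2."user"
"""
rows_as_written = conn.execute(as_written).fetchall()

# Hypothetical variant: no user comparison, DISTINCT to collapse repeats.
variant = """
SELECT DISTINCT l1.*
FROM logs AS l1
INNER JOIN logs AS l2
   ON l1.id <> l2.id
  AND l1.day = l2.day
  AND l1.cause = l2.cause
ORDER BY l1.id
"""
rows_variant = conn.execute(variant).fetchall()

print(len(rows_as_written))
for row in rows_variant:
    print(row)
```

On this sample the query as written returns no rows, since every repeated (day, cause) pair has the same user, while the variant returns the five repeated rows. If the real table lacks a unique `id`, the self-join cannot tell a row apart from its duplicates, and the grouped subquery or window-function approach is the safer choice.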