Postgresql分组为多行

时间:2018-10-05 06:30:04

标签: sql postgresql duplicates postgresql-9.3

我有一个名为hr_holidays_by_calendar的表。我只想过滤出同一名员工在同一天有两张叶子的行

hr_holidays_by_calendar

enter image description here

我尝试过的查询:
解决这个问题的地方不远。

select hol1.employee_id, hol1.leave_date, hol1.no_of_days, hol1.leave_state
from hr_holidays_by_calendar hol1
inner join
    (select employee_id, leave_date 
    from hr_holidays_by_calendar hol1
    group by employee_id, leave_date 
    having count(*)>1)sub
on hol1.employee_id=sub.employee_id and hol1.leave_date=sub.leave_date
where hol1.leave_state != 'refuse'
order by hol1.employee_id, hol1.leave_date

4 个答案:

答案 0 :(得分:2)

这将返回 存在重复的所有行

SELECT employee_id, leave_date, no_of_days, leave_state
FROM   hr_holidays_by_calendar h
WHERE  EXISTS (
   SELECT                         -- select list can be empty for this
   FROM   hr_holidays_by_calendar
   WHERE  employee_id = h.employee_id
   AND    leave_date = h.leave_date
   AND    leave_state <> 'refuse'
   AND    ctid <> h.ctid
   )
AND    leave_state <> 'refuse'
ORDER  BY employee_id, leave_date;

目前尚不清楚应在何处使用leave_state <> 'refuse'。您将必须定义需求。我的示例完全忽略了带有leave_state = 'refuse'(和带有leave_state IS NULL的行)。

ctid是您未声明(未定义?)主键的穷人代理。

相关:

答案 1 :(得分:0)

我认为您只需要颠倒逻辑即可。您可以使用NOT EXISTS

select h1.employee_id, h1.leave_date, h1.no_of_days, h1.leave_state
from hr_holidays_by_calendar h1
where 
  h1.leave_state <> 'refuse'
  and not exists (
    select 1
    from hr_holidays_by_calendar h2
    where 
      h1.employee_id = h2.employee_id
      and h1.leave_date = h2.leave_date
      group by employee_id, leave_date
      having count(*) > 1
  )

这将丢弃行(在同一天离开)的每一对(雇员,日期)对。

我没有考虑天数,因为无论如何这似乎是错误的-您不能在同一天两次休假,休假时间不同。如果您的应用程序允许,请考虑应用其他逻辑。另外,您不应该首先让这些记录进入表中:-)

答案 2 :(得分:0)

我相信简单地使用GROUP BY就能为您完成工作

select hol1.employee_id, hol1.leave_date, max(hol1.no_of_days)
from hr_holidays_by_calendar hol1
where hol1.leave_state != 'refuse'
group by hol1.employee_id, hol1.leave_date

尚不清楚如果两行具有不同的no_of_days会发生什么。

答案 3 :(得分:0)

如果需要完整的行,则一种方法使用窗口函数:

select hc.*
from (select hc.*, count(*) over (partition by employee_id, leave_date) as cnt
      from hr_holidays_by_calendar hc
     ) hc
where cnt >= 2;

如果只需要员工ID和日期,则聚合是合适的。