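I have a dataset in which each row is a salary change. I am trying to get the last salary (the highest) together with its details, namely the reason for the change and the date it was last changed, and as the last column I want the salary before that. How can I get it?
I have already obtained the last salary and the first salary (the latter with MIN()), but what I want is the previous salary, not the first one. The result I get: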
company_id name last_salary_change reason updated_at salary_before
29 Fulano 5000 promotion 2019-05-20 1200
29 Ramon 25000 adjustment 2019-03-23 11500
The query I'm using:
SELECT p.company_id,
       u.name AS name,
       MAX(psc.amount/100) AS last_salary,
       MAX(psc.reason) AS reason,
       MAX(psc.updated_at) AS updated,
       MIN(psc.amount/100) AS first_salary
FROM lukla.profiles AS p
INNER JOIN lukla.profile_salary_changes AS psc
        ON p.id = psc.profile_id
INNER JOIN lukla.users AS u
        ON p.id = u.profile_id
WHERE p.company_id = 29 -- filtered by a specific company
GROUP BY 1, 2
What I'm looking for:
company_id name last_salary_change reason updated_at salary_before
29 Fulano 5000 promotion 2019-05-20 3500
29 Ramon 25000 adjustment 2019-03-23 24000
Answer 0 (score: 1)
This answer is based on @Gordon's answer; the only difference is that I added
JOIN lukla.profiles p
ON p.id = psc.profile_id
JOIN lukla.users u
ON p.id = u.profile_id
inside the first JOIN.
SELECT p.company_id,
       u.name AS name,
       MAX(psc.amount/100) AS last_salary,
       MAX(psc.reason) AS reason,
       MAX(psc.updated_at) AS updated,
       MIN(psc.amount/100) AS first_salary,
       MAX(amount / 100) FILTER (WHERE seqnum = 2) AS prev_salary
FROM lukla.profiles p
JOIN
     (SELECT psc.*,
             -- newest change first, so seqnum = 2 is the change just before the latest one
             ROW_NUMBER() OVER (PARTITION BY p.company_id, u.name ORDER BY psc.updated_at DESC) AS seqnum
      FROM lukla.profile_salary_changes psc
      JOIN lukla.profiles p
        ON p.id = psc.profile_id
      JOIN lukla.users u
        ON p.id = u.profile_id
     ) psc
  ON p.id = psc.profile_id
JOIN lukla.users u
  ON p.id = u.profile_id
WHERE p.company_id = 29
GROUP BY 1, 2;
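As a self-contained illustration of the seqnum idea, the sketch below numbers one profile's changes from newest to oldest, so seqnum = 1 is the latest change and seqnum = 2 is the one just before it. The sample rows and the inline CTE standing in for profile_salary_changes are made up for the example, and FILTER assumes PostgreSQL 9.4 or later.

```sql
-- Hypothetical sample data: one profile with four salary changes (amounts in cents).
WITH profile_salary_changes (profile_id, amount, reason, updated_at) AS (
    VALUES (1, 120000, 'initial',    '2018-01-10'::date)
         , (1, 200000, 'adjustment', '2018-06-01'::date)
         , (1, 350000, 'adjustment', '2018-09-01'::date)
         , (1, 500000, 'promotion',  '2019-05-20'::date)
)
SELECT profile_id,
       -- every "last" column is read from the same row (seqnum = 1)
       MAX(amount / 100) FILTER (WHERE seqnum = 1) AS last_salary,    -- 5000
       MAX(reason)       FILTER (WHERE seqnum = 1) AS reason,         -- promotion
       MAX(updated_at)   FILTER (WHERE seqnum = 1) AS updated,        -- 2019-05-20
       -- the previous salary is read from the row just before it (seqnum = 2)
       MAX(amount / 100) FILTER (WHERE seqnum = 2) AS salary_before   -- 3500
FROM (SELECT psc.*,
             ROW_NUMBER() OVER (PARTITION BY psc.profile_id
                                ORDER BY psc.updated_at DESC) AS seqnum
      FROM profile_salary_changes psc) numbered
GROUP BY profile_id;
```

Filtering every column on a sequence number, rather than taking an unfiltered MAX per column, keeps the reported amount, reason and date tied to a single row. A profile with only one change would get NULL for salary_before.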
Answer 1 (score: 0)
It seems you need the second-highest salary for that column. You can try the query below:
SELECT p.company_id,
       u.name AS name,
       MAX(psc.amount/100) AS last_salary,
       MAX(psc.reason) AS reason,
       MAX(psc.updated_at) AS updated,
       -- second-highest amount recorded for the same profile
       (SELECT MAX(psc2.amount/100)
        FROM lukla.profile_salary_changes psc2
        WHERE psc2.profile_id = p.id
          AND (psc2.amount/100) < (SELECT MAX(psc3.amount/100)
                                   FROM lukla.profile_salary_changes psc3
                                   WHERE psc3.profile_id = p.id)) AS salary_before
FROM lukla.profiles AS p
INNER JOIN lukla.profile_salary_changes AS psc
        ON p.id = psc.profile_id
INNER JOIN lukla.users AS u
        ON p.id = u.profile_id
WHERE p.company_id = 29 -- filtered by a specific company
GROUP BY p.company_id, u.name, p.id -- p.id is grouped so the subqueries can reference it
I haven't tested it, but it should work.
Answer 2 (score: 0)
You can use window functions together with conditional aggregation:
SELECT p.company_id,
       u.name AS name,
       MAX(psc.amount/100) AS last_salary,
       MAX(psc.reason) AS reason,
       MAX(psc.updated_at) AS updated,
       MIN(psc.amount/100) AS first_salary,
       MAX(amount / 100) FILTER (WHERE seqnum = 2) AS prev_salary
FROM lukla.profiles p INNER JOIN
     (SELECT psc.*,
             -- newest change first, so seqnum = 2 is the change just before the latest one
             ROW_NUMBER() OVER (PARTITION BY psc.profile_id ORDER BY psc.updated_at DESC) AS seqnum
      FROM lukla.profile_salary_changes psc
     ) psc
     ON p.id = psc.profile_id INNER JOIN
     lukla.users u
     ON p.id = u.profile_id
WHERE p.company_id = 29 -- filtered by a specific company
GROUP BY 1, 2;
Two comments:
JOIN
does not introduce new rows.
Answer 3 (score: 0)
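With all due respect, the whole concept has significant problems. It rests on the assumptions that the first salary is always the lowest, that the last is the highest, and that salaries only ever go up. None of these is true. The assumption that MAX can be used to obtain the latest information also means this construct is rarely correct at best. Take this section: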
MAX(psc.amount/100) AS last_salary,
MAX(psc.reason) AS reason,
MAX(psc.updated_at) AS updated,
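Even granting that MAX(psc.amount) returns the correct value, and even assuming the date is picked correctly as well, this does not hold for reason. Care must always be taken when combining the MAX of a text column with another column: the maximum text need not correspond to the maximum amount.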
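A minimal sketch of that mismatch, using made-up rows in an inline CTE that stands in for profile_salary_changes: each MAX is evaluated per column, so the three values can come from three different rows.

```sql
-- Hypothetical sample data: one profile with three salary changes (amounts in cents).
WITH profile_salary_changes (profile_id, amount, reason, updated_at) AS (
    VALUES (1, 120000, 'promotion',  '2018-01-10'::date)
         , (1, 350000, 'adjustment', '2018-09-01'::date)
         , (1, 500000, 'bonus',      '2019-05-20'::date)
)
SELECT profile_id,
       MAX(amount / 100) AS last_salary,  -- 5000, taken from the 2019-05-20 row
       MAX(reason)       AS reason,       -- 'promotion', taken from the 2018-01-10 row
       MAX(updated_at)   AS updated       -- 2019-05-20
FROM profile_salary_changes
GROUP BY profile_id;
```

Here the reported reason is 'promotion' only because it sorts last alphabetically, while the latest change was actually a 'bonus'.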
This was an interesting problem, so the following addresses these issues. Take it or leave it as you see fit. Either way, I found it fun.
--- create "tables" as cte
with profiles (id, company_id) as
( values (61,29)
, (62,29)
, (63,29)
, (64,29)
)
, profile_salary_changes (profile_id, amount, reason, updated_at) as
( values (64, 640000, 'Initial Hire', '2018-02-15'::date)
, (64, 611200, 'Salary cut 4.5% across the board Co wide.', '2018-07-05'::date)
, (64, 625600, '50% July''s cut recovery', '2018-12-09'::date)
, (64, 710000, 'Promotion', '2019-02-09'::date)
, (63, 630000, 'Initial ', '2019-02-15'::date)
, (63, 600000, 'Transfer.', '2019-07-05'::date)
, (63, 627500, 'COL', '2019-12-09'::date)
, (61, 100000, 'Initial Only', '2019-05-09'::date)
, (62, 620000, 'First', '2019-03-09'::date)
, (62, 625000, 'Bonus', '2019-08-09'::date)
)
, users (profile_id, name) as
( values (61, 'Test1')
, (62, 'Test2')
, (63, 'Test3')
, (64, 'Test4')
)
-- final selection pick up designated columns
select company_id
, name
, reason
, trunc(amount/100.0,2) last_salary
, updated_at
, trunc(prev_sal/100.0,2) prev_salary
, trunc(first_salary/100.0,2) first_salary
from ( -- pick up the appropriate previous and first salary
select x.*
, lag (amount) over( partition by company_id, profile_id order by updated_at ) prev_sal
, lag (amount, (rn-1)::integer) over( partition by company_id, profile_id order by updated_at ) first_salary
from (
-- gather all columns, number each row for company and profile, and get number of total rows for each set
select p.company_id
, u.name
, psc.*
, row_number() over( partition by p.company_id, psc.profile_id order by updated_at) rn
, count(*) over( partition by p.company_id, psc.profile_id) r
from profile_salary_changes psc
join profiles p on (p.id = psc.profile_id)
join users u on (p.id = u.profile_id)
) x
) z
-- select only the last row in each set. Additional salary values have been attached
where r = rn
order by name;
Answer 4 (score: 0)
Assuming your database structure looks like this: https://www.db-fiddle.com/f/i2BNYKhSaiu1xGPDPfeydr/1
You can run the following query:
SELECT p.company_id,
p.id,
last_salary.amount/100 AS last_salary,
last_salary.reason AS reason,
last_salary.updated_at AS updated,
prev_salary.updated_at as salary_before_updated_at,
prev_salary.amount/100 AS salary_before
FROM profiles AS p
LEFT JOIN profile_salary_changes AS last_salary
ON p.id = last_salary.profile_id
AND NOT EXISTS (
SELECT * FROM profile_salary_changes psc2
WHERE psc2.profile_id = last_salary.profile_id
AND psc2.updated_at > last_salary.updated_at)
LEFT JOIN profile_salary_changes AS prev_salary
ON p.id = prev_salary.profile_id
AND prev_salary.updated_at != last_salary.updated_at
AND NOT EXISTS (
SELECT * FROM profile_salary_changes psc3
WHERE psc3.profile_id = prev_salary.profile_id
AND psc3.updated_at < last_salary.updated_at
AND psc3.updated_at > prev_salary.updated_at)
WHERE p.company_id = 29;
If your profile_salary_changes table has a primary key field such as id, it is better to use it in the comparisons instead of updated_at.
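As a rough sketch of that variant, the query below compares on an id column instead of updated_at. The id column, its values, and the sample rows are assumptions for illustration, and the approach only works if id increases with each new salary change (for example a serial key). Two of the rows deliberately share a date, a case where comparing updated_at alone cannot tell which change came last.

```sql
-- Hypothetical data: `id` is an assumed auto-incrementing primary key.
WITH profile_salary_changes (id, profile_id, amount, reason, updated_at) AS (
    VALUES (1, 61, 120000, 'initial',    '2019-01-10'::date)
         , (2, 61, 350000, 'adjustment', '2019-05-20'::date)
         , (3, 61, 500000, 'promotion',  '2019-05-20'::date)
)
SELECT last_salary.profile_id,
       last_salary.amount / 100 AS last_salary,    -- 5000
       last_salary.reason       AS reason,         -- promotion
       prev_salary.amount / 100 AS salary_before   -- 3500
FROM profile_salary_changes AS last_salary
LEFT JOIN profile_salary_changes AS prev_salary
       ON prev_salary.profile_id = last_salary.profile_id
      AND prev_salary.id < last_salary.id
      -- no other change sits between prev_salary and last_salary
      AND NOT EXISTS (
            SELECT 1
            FROM profile_salary_changes AS between_rows
            WHERE between_rows.profile_id = last_salary.profile_id
              AND between_rows.id > prev_salary.id
              AND between_rows.id < last_salary.id)
-- keep only the newest change per profile
WHERE NOT EXISTS (
        SELECT 1
        FROM profile_salary_changes AS newer
        WHERE newer.profile_id = last_salary.profile_id
          AND newer.id > last_salary.id);
```

Because id is unique, exactly one row per profile qualifies as the latest change, and the LEFT JOIN still returns that row with a NULL salary_before when the profile has only one change.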