Question

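I have a dataset in which each row is a salary change. I am trying to get the last (highest) salary together with all of its details, such as the reason for the change and the date it was last changed, and, as the final column, the salary value that came before it. How can I get that?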

I already have the last salary, and the first salary using MIN(), but what I want is the previous salary, not the first one. The result I get:

company_id  name    last_salary_change  reason      updated_at  salary_before
29          Fulano  5000                promotion   2019-05-20  1200
29          Ramon   25000               adjustment  2019-03-23  11500

The query I am using:

SELECT p.company_id,
       u.name AS name,
       MAX(psc.amount/100) AS last_salary,
       MAX(psc.reason) AS reason,
       MAX(psc.updated_at) AS updated,
       MIN(psc.amount/100) AS first_salary
FROM lukla.profiles AS p

INNER JOIN lukla.profile_salary_changes AS psc
  ON p.id = psc.profile_id
INNER JOIN lukla.users AS u
  ON p.id = u.profile_id

WHERE p.company_id = 29  -- filtered by a specific company

GROUP BY 1, 2

What I am looking for:

company_id  name    last_salary_change  reason      updated_at  salary_before
29          Fulano  5000                promotion   2019-05-20  3500
29          Ramon   25000               adjustment  2019-03-23  24000

Answer 1

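This answer is based on @Gordon's answer. The difference is that I added

      JOIN lukla.profiles p
      ON p.id = psc.profile_id
      JOIN lukla.users u
      ON p.id = u.profile_id

inside the subquery of the first JOIN, so that p.company_id and u.name can be referenced in the PARTITION BY.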

SELECT p.company_id,
       u.name AS name,
       MAX(psc.amount/100) AS last_salary,
       MAX(psc.reason) AS reason,
       MAX(psc.updated_at) AS updated,
       MIN(psc.amount/100) AS first_salary,
       MAX(amount / 100) FILTER (WHERE seqnum = 2) as prev_salary
FROM lukla.profiles p 

JOIN
    (SELECT psc.*,
             -- DESC: seqnum = 1 is the latest change, seqnum = 2 the one just before it
             ROW_NUMBER() OVER (PARTITION BY p.company_id, u.name ORDER BY psc.updated_at DESC) as seqnum
      FROM lukla.profile_salary_changes psc
      JOIN lukla.profiles p
      ON p.id = psc.profile_id 
      JOIN lukla.users u
      ON p.id = u.profile_id

     ) psc
  ON p.id = psc.profile_id 
JOIN lukla.users u
  ON p.id = u.profile_id

WHERE p.company_id = 29
GROUP BY 1, 2;

Answer 2

It seems you need the second-highest salary for that column. You can try the query below:

SELECT p.company_id,
       u.name AS name,
       MAX(psc.amount/100) AS last_salary,
       MAX(psc.reason) AS reason,
       MAX(psc.updated_at) AS updated,
       -- second-highest salary of this profile: the largest value below the maximum
       (SELECT MAX(psc2.amount/100)
        FROM lukla.profile_salary_changes psc2
        WHERE psc2.profile_id = p.id
        AND (psc2.amount/100) < (SELECT MAX(psc3.amount/100)
                                 FROM lukla.profile_salary_changes psc3
                                 WHERE psc3.profile_id = p.id)) AS salary_before
FROM lukla.profiles AS p
INNER JOIN lukla.profile_salary_changes AS psc
      ON p.id = psc.profile_id
INNER JOIN lukla.users AS u
      ON p.id = u.profile_id
WHERE p.company_id = 29  -- filtered by a specific company
GROUP BY p.company_id, u.name, p.id  -- p.id is grouped so the subquery may reference it

I have not tested it, but it should work.

Answer 3

You can use window functions together with conditional aggregation:

SELECT p.company_id,
       u.name AS name,
       MAX(psc.amount/100) AS last_salary,
       MAX(psc.reason) AS reason,
       MAX(psc.updated_at) AS updated,
       MIN(psc.amount/100) AS first_salary,
       MAX(amount / 100) FILTER (WHERE seqnum = 2) as prev_salary
FROM lukla.profiles p INNER JOIN
     (SELECT psc.*,
             -- partition by profile_id: p and u are not in scope inside this subquery;
             -- DESC makes seqnum = 2 the change just before the latest one
             ROW_NUMBER() OVER (PARTITION BY psc.profile_id ORDER BY psc.updated_at DESC) as seqnum
      FROM lukla.profile_salary_changes psc
     ) psc
     ON p.id = psc.profile_id INNER JOIN
     lukla.users u
     ON p.id = u.profile_id
WHERE p.company_id = 29  -- filtered by a specific company
GROUP BY 1, 2;

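Two comments:

You are assuming that the first salary is the lowest. That is not always the case.
This assumes that the salary changes table really contains only salary changes and that the other JOINs do not introduce new rows.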
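
To illustrate the first comment, here is a minimal sketch (PostgreSQL assumed) that picks the first, previous, and last amounts by position in time rather than by size. The CTE name changes, the column aliases, and the three sample rows are made up for the illustration and are not the asker's data.

```sql
-- Hypothetical sample: one profile whose second change is a pay cut.
WITH changes (profile_id, amount, updated_at) AS (
    VALUES (1, 640000, DATE '2018-02-15')
         , (1, 611200, DATE '2018-07-05')
         , (1, 710000, DATE '2019-02-09')
), numbered AS (
    SELECT c.*,
           -- number the rows oldest-first and newest-first within each profile
           ROW_NUMBER() OVER (PARTITION BY profile_id ORDER BY updated_at)      AS seq_asc,
           ROW_NUMBER() OVER (PARTITION BY profile_id ORDER BY updated_at DESC) AS seq_desc
    FROM changes c
)
SELECT profile_id,
       MAX(amount) FILTER (WHERE seq_desc = 1) AS last_amount,    -- 710000
       MAX(amount) FILTER (WHERE seq_desc = 2) AS prev_amount,    -- 611200
       MAX(amount) FILTER (WHERE seq_asc  = 1) AS first_amount,   -- 640000
       MIN(amount)                             AS lowest_amount   -- 611200, not the first
FROM numbered
GROUP BY profile_id;
```

With these rows MIN(amount) returns the pay cut, while the row numbered first by date carries the actual starting salary.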

Answer 4

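With respect, the whole concept has a significant problem. It rests on the assumptions that the first salary is always the lowest, that the last one is always the highest, and that salaries only ever go up. None of these necessarily holds. The assumption that MAX can be used to get the latest information also makes this construct tenuous at best. Consider this section: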

MAX(psc.amount/100) AS last_salary,
MAX(psc.reason) AS reason,
MAX(psc.updated_at) AS updated,

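Even granting that MAX(psc.amount) returns the correct value, and assuming the date is then also selected correctly, the same does not hold for reason. You must always be careful when combining the maximum of a text column with another column: the maximum text need not correspond to the maximum amount.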
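
A small sketch of that mismatch, assuming PostgreSQL. The CTE name changes and its two rows are invented for the illustration.

```sql
-- Hypothetical sample rows for a single profile (not taken from the question).
WITH changes (profile_id, amount, reason, updated_at) AS (
    VALUES (1, 120000, 'promotion',  DATE '2019-01-10')
         , (1, 500000, 'adjustment', DATE '2019-05-20')
)
SELECT profile_id,
       MAX(amount / 100) AS last_salary,  -- 5000, from the 'adjustment' row
       MAX(reason)       AS reason,       -- 'promotion': the greatest text, from the other row
       MAX(updated_at)   AS updated       -- 2019-05-20
FROM changes
GROUP BY profile_id;
```

Each MAX is evaluated independently, so the returned reason belongs to a different row than the returned amount and date.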

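This was an interesting problem, so the following addresses these issues. Accept it or ignore it as you see fit. Either way, I enjoyed working on it.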

--- create "tables" as cte
with profiles (id, company_id) as
     ( values (61,29)
            , (62,29)
            , (63,29)
            , (64,29)
     )
   , profile_salary_changes (profile_id, amount, reason, updated_at) as
     ( values (64, 640000, 'Initial Hire', '2018-02-15'::date)
            , (64, 611200, 'Salary cut 4.5% across the board Co wide.', '2018-07-05'::date)
            , (64, 625600, '50% July''s cut recovery', '2018-12-09'::date)
            , (64, 710000, 'Promotion', '2019-02-09'::date)            
            , (63, 630000, 'Initial  ', '2019-02-15'::date)
            , (63, 600000, 'Transfer.', '2019-07-05'::date)
            , (63, 627500, 'COL', '2019-12-09'::date)
            , (61, 100000, 'Initial Only', '2019-05-09'::date)  
            , (62, 620000, 'First', '2019-03-09'::date)  
            , (62, 625000, 'Bonus', '2019-08-09'::date)        
     )
   , users (profile_id, name) as
     ( values (61, 'Test1')
            , (62, 'Test2')
            , (63, 'Test3')
            , (64, 'Test4')
     )
-- final selection pick up designated columns 
select company_id
     , name 
     , reason
     , trunc(amount/100.0,2)        last_salary
     , updated_at  
     , trunc(prev_sal/100.0,2)      prev_salary
     , trunc(first_salary/100.0,2)  first_salary
  from ( -- pick up the appropriate previous and first salary
         select x.*
              , lag (amount)                  over( partition by company_id, profile_id order by updated_at ) prev_sal
              , lag (amount, (rn-1)::integer) over( partition by company_id, profile_id order by updated_at ) first_salary      
           from (
                 -- gather all columns, number each row for company and profile, and get number of total rows for each set 
                 select p.company_id
                      , u.name
                      , psc.*  
                      , row_number() over( partition by p.company_id, psc.profile_id order by updated_at) rn
                      , count(*)     over( partition by p.company_id, psc.profile_id) r
                   from profile_salary_changes psc
                   join profiles               p on (p.id = psc.profile_id)
                   join users                  u on (p.id = u.profile_id)
                ) x
         ) z
 -- select only the last row in each set. Additional salary values have been attached 
 where r = rn 
 order by name;

Answer 5

Assuming your database structure is as follows: https://www.db-fiddle.com/f/i2BNYKhSaiu1xGPDPfeydr/1

You can run the following query:

SELECT p.company_id,
    p.id,
    last_salary.amount/100 AS last_salary,
    last_salary.reason AS reason,
    last_salary.updated_at AS updated,
    prev_salary.updated_at as salary_before_updated_at,
    prev_salary.amount/100 AS salary_before
FROM profiles AS p
LEFT JOIN profile_salary_changes AS last_salary
    ON p.id = last_salary.profile_id 
        AND NOT EXISTS (
            SELECT * FROM profile_salary_changes psc2 
                WHERE psc2.profile_id = last_salary.profile_id 
                    AND psc2.updated_at > last_salary.updated_at)
LEFT JOIN profile_salary_changes AS prev_salary
    ON p.id = prev_salary.profile_id 
        AND prev_salary.updated_at != last_salary.updated_at 
        AND NOT EXISTS (
            SELECT * FROM profile_salary_changes psc3 
                WHERE psc3.profile_id = prev_salary.profile_id 
                    AND psc3.updated_at < last_salary.updated_at 
                    AND psc3.updated_at > prev_salary.updated_at)
WHERE p.company_id = 29;

If your profile_salary_changes table has a primary key field such as id, it is better to use that in the comparisons instead of updated_at.
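
One possible reading of that advice, sketched for PostgreSQL: keep updated_at as the main ordering and use the key as a tie-breaker through a row-wise comparison. The id column and the sample rows here are assumptions; the question does not show whether such a column exists.

```sql
-- Hypothetical rows: both changes share the same updated_at, so only id orders them.
WITH profile_salary_changes (id, profile_id, amount, updated_at) AS (
    VALUES (1, 61, 350000, DATE '2019-05-20')
         , (2, 61, 500000, DATE '2019-05-20')
)
SELECT last_salary.profile_id,
       last_salary.amount / 100 AS last_salary
FROM profile_salary_changes AS last_salary
WHERE NOT EXISTS (
    SELECT 1
    FROM profile_salary_changes AS psc2
    WHERE psc2.profile_id = last_salary.profile_id
      -- row-wise comparison: a later date, or the same date with a higher id
      AND (psc2.updated_at, psc2.id) > (last_salary.updated_at, last_salary.id)
);
```

Comparing on updated_at alone would keep both rows here, because neither is strictly later than the other. With the tie-breaker only the row with id 2 remains.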

How do I get the row with the max column and the second-highest value as the last column?
