Multiple joins and the last N rows for each parent

Date: 2017-01-15 17:53:35

Tags: php mysql greatest-n-per-group

I have 3 tables.

companies
- id
- name
- user_id

departments
- id
- name
- user_id
- company_id

invoices
- id
- department_id
- price
- created_at

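I'm trying to fetch all the data needed for the dashboard screen in one big MySQL query, for performance reasons. It's worth mentioning that the invoices table has 700k records and will only keep growing.

So I need to get all of the user's companies, their departments, and the last 2 invoices for each department (the 2 rows with the latest date per department ID).

I have no problem with the first two; I can easily get them like this: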

SELECT companies.id as company_id, companies.name as company_name, departments.id as department_id, departments.name as department_name
FROM companies
LEFT JOIN departments
ON companies.id = departments.company_id
WHERE companies.user_id = 1

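What I'm struggling with is getting the latest 2 invoices for each department. What is the best way to do that in the same query?

The data provided here is the same as in the SQL Fiddle.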

CREATE TABLE `companies` (
  `id` int(10) UNSIGNED NOT NULL,
  `name` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `user_id` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

CREATE TABLE `departments` (
  `id` int(10) UNSIGNED NOT NULL,
  `name` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `user_id` int(11) NOT NULL,
  `company_id` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

CREATE TABLE `invoices` (
  `id` int(10) UNSIGNED NOT NULL,
  `price` decimal(6,2)  NOT NULL,
  `created_at` timestamp NULL DEFAULT NULL,
  `department_id` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

ALTER TABLE `companies`
  ADD PRIMARY KEY (`id`);

ALTER TABLE `departments`
  ADD PRIMARY KEY (`id`);

ALTER TABLE `invoices`
  ADD PRIMARY KEY (`id`);

ALTER TABLE `companies`
  MODIFY `id` int(10) UNSIGNED NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=1;

ALTER TABLE `departments`
  MODIFY `id` int(10) UNSIGNED NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=1;

ALTER TABLE `invoices`
  MODIFY `id` int(10) UNSIGNED NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=1;

INSERT INTO companies
  (`name`, `user_id`)
VALUES
  ('Google', 1),
  ('Apple', 1),
  ('IBM', 1)
;

INSERT INTO departments
  (`name`, `user_id`, `company_id`)
VALUES
  ('Billing', 1, 1),
  ('Support', 1, 1),
  ('Tech', 1, 1),
  ('Billing Dept', 1, 2),
  ('Support Dept', 1, 2),
  ('Tech Dept', 1, 2),
  ('HR', 1, 3),
  ('IT', 1, 3),
  ('Executive', 1, 3)
;

INSERT INTO invoices
  (`price`, `created_at`, `department_id`)
VALUES
  (155.23, '2016-04-07 14:39:29', 1),
  (123.23, '2016-04-07 14:40:26', 1),
  (150.50, '2016-04-07 14:40:30', 1),
  (123.23, '2016-04-07 14:41:38', 1),
  (432.65, '2016-04-07 14:44:15', 1),
  (323.23, '2016-04-07 14:44:22', 2),
  (541.43, '2016-04-07 14:44:33', 2),
  (1232.23, '2016-04-07 14:44:36', 2),
  (433.42, '2016-04-07 14:44:37', 2),
  (1232.43, '2016-04-07 14:44:39', 2),
  (850.40, '2016-04-07 14:44:46', 3),
  (133.32, '2016-04-07 14:45:11', 3),
  (12.43, '2016-04-07 14:45:15', 3),
  (154.23, '2016-04-07 14:45:25', 3),
  (132.43, '2016-04-07 14:46:01', 3),
  (859.55, '2016-04-07 14:53:11', 4),
  (123.43, '2016-04-07 14:53:45', 4),
  (433.33, '2016-04-07 14:54:14', 4),
  (545.12, '2016-04-07 14:54:54', 4),
  (949.99, '2016-04-07 14:55:10', 4),
  (1112.32, '2016-04-07 14:53:40', 5),
  (132.32, '2016-04-07 14:53:44', 5),
  (42.43, '2016-04-07 14:53:48', 5),
  (545.34, '2016-04-07 14:53:56', 5),
  (2343.32, '2016-04-07 14:54:05', 5),
  (3432.43, '2016-04-07 14:54:02', 6),
  (231.32, '2016-04-07 14:54:22', 6),
  (1242.33, '2016-04-07 14:54:54', 6),
  (232.32, '2016-04-07 14:55:12', 6),
  (43.12, '2016-04-07 14:55:23', 6),
  (4343.23, '2016-04-07 14:55:24', 7),
  (1123.32, '2016-04-07 14:55:31', 7),
  (4343.32, '2016-04-07 14:55:56', 7),
  (354.23, '2016-04-07 14:56:04', 7),
  (867.76, '2016-04-07 14:56:12', 7),
  (45.76, '2016-04-07 14:55:54', 8),
  (756.65, '2016-04-07 14:56:08', 8),
  (153.74, '2016-04-07 14:56:14', 8),
  (534.86, '2016-04-07 14:56:23', 8),
  (867.65, '2016-04-07 14:56:55', 8),
  (433.56, '2016-04-07 14:56:32', 9),
  (1423.43, '2016-04-07 14:56:54', 9),
  (342.56, '2016-04-07 14:57:11', 9),
  (343.75, '2016-04-07 14:57:23', 9),
  (1232.43, '2016-04-07 14:57:34', 9)
;

This is the expected result.

company_id | company_name | department_id | department_name | invoice_price | invoice_created_at
         1 | Google       |             1 | Billing         |        123.23 | 2016-04-07 14:41:38
         1 | Google       |             1 | Billing         |        432.65 | 2016-04-07 14:44:15
         1 | Google       |             2 | Support         |        433.42 | 2016-04-07 14:44:37
         1 | Google       |             2 | Support         |       1232.43 | 2016-04-07 14:44:39
         1 | Google       |             3 | Tech            |        154.23 | 2016-04-07 14:45:25
         1 | Google       |             3 | Tech            |        132.43 | 2016-04-07 14:46:01
         2 | Apple        |             4 | Billing Dept    |        545.12 | 2016-04-07 14:54:54
         2 | Apple        |             4 | Billing Dept    |        949.99 | 2016-04-07 14:55:10
         2 | Apple        |             5 | Support Dept    |        545.34 | 2016-04-07 14:53:56
         2 | Apple        |             5 | Support Dept    |       2343.32 | 2016-04-07 14:54:05
         2 | Apple        |             6 | Tech Dept       |        232.32 | 2016-04-07 14:55:12
         2 | Apple        |             6 | Tech Dept       |         43.12 | 2016-04-07 14:55:23
         3 | IBM          |             7 | HR              |        354.23 | 2016-04-07 14:56:04
         3 | IBM          |             7 | HR              |        867.76 | 2016-04-07 14:56:12
         3 | IBM          |             8 | IT              |        534.86 | 2016-04-07 14:56:23
         3 | IBM          |             8 | IT              |        867.65 | 2016-04-07 14:56:55
         3 | IBM          |             9 | Executive       |        343.75 | 2016-04-07 14:57:23
         3 | IBM          |             9 | Executive       |       1232.43 | 2016-04-07 14:57:34

2 Answers:

Answer 0 (score: 2)

I have to admit I struggled a bit to see how your result set matches your description and data set, but here's something to play with...

SELECT x.price
     , x.created_at
     , x.department_id
     , x.department
     , x.department_user 
     , x.company_id
     , x.company
     , x.company_user 
  FROM 
     ( SELECT i.id
            , i.price
            , i.created_at
            , i.department_id
            , d.name department
            , d.user_id department_user 
            , d.company_id
            , c.name company
            , c.user_id company_user
            , CASE WHEN @prev=department_id THEN @i:=@i+1 ELSE @i:=1 END i
            , @prev := i.department_id
         FROM invoices i 
         JOIN departments d 
           ON d.id = i.department_id 
         JOIN companies c 
           ON c.id = d.company_id
         JOIN (SELECT @prev:=null, @i:=0) vars
        ORDER 
           BY department_id
            , created_at DESC
     ) x
 WHERE i<=2;
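
The same "number the rows within each department, keep the first two" idea can also be sketched with a window function instead of user variables. This is only a sketch, and it assumes MySQL 8.0 or later, which the thread does not confirm. The alias `rn` and the `i.id` tie-breaker are illustrative choices, not part of the original answer.

```sql
-- Assumes MySQL 8.0+ (window functions). Alias names are illustrative.
SELECT c.id         AS company_id,
       c.name       AS company_name,
       d.id         AS department_id,
       d.name       AS department_name,
       r.price      AS invoice_price,
       r.created_at AS invoice_created_at
  FROM companies c
  LEFT JOIN departments d
    ON d.company_id = c.id
  LEFT JOIN (
        -- Number each department's invoices from newest (1) to oldest;
        -- i.id breaks ties between identical timestamps.
        SELECT i.department_id,
               i.price,
               i.created_at,
               ROW_NUMBER() OVER (PARTITION BY i.department_id
                                  ORDER BY i.created_at DESC, i.id DESC) AS rn
          FROM invoices i
       ) r
    ON r.department_id = d.id
   AND r.rn <= 2
 WHERE c.user_id = 1
 ORDER BY c.id, d.id, r.created_at;
```

On the sample data this should give the 18 rows listed in the question. Note that the derived table numbers every row in invoices before the join filters them, so on a 700k-row table it is worth checking the plan with EXPLAIN.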

Here's a slower way of conceptualizing the same idea (I've left out the less relevant bits)...

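SELECT x.* 
  FROM invoices x 
  JOIN invoices y 
    ON y.department_id = x.department_id 
   AND y.created_at >= x.created_at -- y ranges over invoices at least as recent as x
 GROUP 
    BY x.department_id
     , x.created_at
HAVING COUNT(*) <=2; -- keep x only if at most 2 invoices (itself included) are that recent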

Answer 1 (score: 1)

One idea is to add a JOIN on invoices:

LEFT JOIN invoices i ON  i.department_id = departments.id

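This way you get all invoices for each department. But you need to limit them to the last two per department. One way would be an additional IN condition with a correlated subquery that uses LIMIT 2: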

LEFT JOIN invoices i
  ON  i.department_id = departments.id
  AND i.id IN (
    SELECT i1.id
    FROM invoices i1
    WHERE i1.department_id = departments.id
    ORDER BY i1.id DESC
    LIMIT 2
  )

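But for some odd reason, MySQL doesn't allow LIMIT inside an IN subquery. So we need to be a bit trickier and avoid the IN condition. Instead, we can use >= and select the second-highest ID with LIMIT 1 OFFSET 1: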

  AND i.id >= (
    SELECT i1.id
    FROM invoices i1
    WHERE i1.department_id = departments.id
    ORDER BY i1.id DESC
    LIMIT 1
    OFFSET 1
  )

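Now one last problem: if a department has only one invoice, there is no second invoice to find. The subquery will return NULL, and the condition will always fail. For that case, we use COALESCE to replace the NULL with 0.

So the final query looks like this: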
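
To see why the COALESCE matters, here is a minimal illustration of how a comparison against NULL behaves. The value 5 is an arbitrary stand-in for an invoice ID.

```sql
-- A comparison with NULL yields NULL, which a JOIN or WHERE condition treats as not true.
SELECT 5 >= NULL              AS without_coalesce,  -- NULL
       5 >= COALESCE(NULL, 0) AS with_coalesce;     -- 1
```

Because the IDs come from an unsigned AUTO_INCREMENT column starting at 1, comparing against 0 lets a department's single invoice through.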

SELECT companies.id as company_id,
       companies.name as company_name,
       departments.id as department_id,
       departments.name as department_name,
       i.id as invoice_id,
       i.price as invoice_price
FROM companies
LEFT JOIN departments
  ON companies.id = departments.company_id
LEFT JOIN invoices i
  ON  i.department_id = departments.id
  AND i.id >= COALESCE((
    SELECT i1.id
    FROM invoices i1
    WHERE i1.department_id = departments.id
    ORDER BY i1.id DESC
    LIMIT 1
    OFFSET 1
  ), 0)
WHERE companies.user_id = 1

http://sqlfiddle.com/#!9/8a956/14
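
Since the question mentions 700k invoices, the correlated subquery will probably only perform well if it can use an index on invoices.department_id. The schema as posted defines nothing but primary keys, so something along these lines may help. This is a hedged sketch: the index names are made up, and whether each index pays off depends on the real data.

```sql
-- Index names are hypothetical; the posted schema defines only primary keys.
ALTER TABLE invoices
  ADD INDEX idx_invoices_department_id (department_id, id);

ALTER TABLE departments
  ADD INDEX idx_departments_company_id (company_id);

ALTER TABLE companies
  ADD INDEX idx_companies_user_id (user_id);
```

Prefixing the final query with EXPLAIN before and after adding the indexes shows whether MySQL actually uses them.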