Get the latest child records without …

Time: 2014-09-22 15:06:57

Tags: sql ruby-on-rails postgresql greatest-n-per-group has-many

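Simplified, my situation is as follows. I have two tables. A migration has many checks via checks.migration_id. The column checks.old describes the kind of check. For each migration, I want the check with the greatest time where old is true (query 1) and where old is false (query 2).

There are about 30,000 migrations, each with roughly 1,000 checks where old = true and 1,000 checks where old = false. The checks table will become very large. The order of the checks is not guaranteed and may be completely mixed.

I want to fetch the latest checks for at most 150 migrations at a time.

SQL Fiddle: http://sqlfiddle.com/#!15/282ce/15

I am using PostgreSQL 9.3 and Rails 3.2 (which should not matter much).

What is the most efficient way to get the latest child record where old = true?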

Table migrations:

| ID |
|----|
|  1 |
|  2 |

Table checks:

| ID | MIGRATION_ID | OLD | OK |                             TIME |
|----|--------------|-----|----|----------------------------------|
|  1 |            1 |   1 |  1 | September, 22 2014 12:00:01+0000 |
|  2 |            1 |   0 |  1 | September, 22 2014 12:00:02+0000 |
|  3 |            2 |   1 |  1 | September, 22 2014 12:00:01+0000 |
|  4 |            2 |   0 |  1 | September, 22 2014 12:00:02+0000 |
|  5 |            1 |   1 |  1 | September, 22 2014 12:00:03+0000 |
|  6 |            1 |   0 |  1 | September, 22 2014 12:00:04+0000 |
|  7 |            2 |   1 |  1 | September, 22 2014 12:00:03+0000 |
|  8 |            2 |   0 |  1 | September, 22 2014 12:00:04+0000 |

Query 1 should return the following result:

| Migration.id | Check_ID | OLD | OK |                             TIME |
|--------------|----------|-----|----|----------------------------------|
|      1       |     5    |   1 |  1 | September, 22 2014 12:00:03+0000 |
|      2       |     7    |   1 |  1 | September, 22 2014 12:00:03+0000 |

Query 2 should return the following result:

| Migration.id | Check_ID | OLD | OK |                             TIME |
|--------------|----------|-----|----|----------------------------------|
|      1       |     6    |   0 |  1 | September, 22 2014 12:00:04+0000 |
|      2       |     8    |   0 |  1 | September, 22 2014 12:00:04+0000 |

I tried to solve it with max in a subquery, but then I lose the information about checks.ok and checks.time.

SELECT eq.id, (SELECT max(checks.id) FROM checks WHERE checks.migration_id = eq.id and checks.old = 't') AS latest FROM  migrations eq;
SELECT eq.id, (SELECT max(checks.id) FROM checks WHERE checks.migration_id = eq.id and checks.old = 'f') AS latest FROM  migrations eq;

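(I know that I am getting max(id) rather than max(time).)

In Rails, I tried fetching the latest record for each migration separately, which leads to a 1 + n problem. I cannot eager-load all checks, because there are far too many of them.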

1 Answer:

Answer 0 (score: 1)

A simple solution with the Postgres-specific DISTINCT ON:

Query 1 ("for each migration, the check with the greatest time where old is true"):

SELECT DISTINCT ON (migration_id)
       migration_id, id AS check_id, old, ok, time
FROM   checks
WHERE  old
ORDER  BY migration_id, time DESC;

For query 2, invert the WHERE condition:

...
WHERE  NOT old
...
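
To make the semantics of DISTINCT ON concrete, here is a minimal Python sketch (not part of the original answer) that emulates the query on the sample rows from the question. The function name latest_checks and the tuple layout are assumptions made for illustration, and the timestamps are written as ISO strings so that they sort correctly as text.

```python
from itertools import groupby
from operator import itemgetter

# Sample rows from the question: (id, migration_id, old, ok, time)
CHECKS = [
    (1, 1, True,  True, "2014-09-22 12:00:01"),
    (2, 1, False, True, "2014-09-22 12:00:02"),
    (3, 2, True,  True, "2014-09-22 12:00:01"),
    (4, 2, False, True, "2014-09-22 12:00:02"),
    (5, 1, True,  True, "2014-09-22 12:00:03"),
    (6, 1, False, True, "2014-09-22 12:00:04"),
    (7, 2, True,  True, "2014-09-22 12:00:03"),
    (8, 2, False, True, "2014-09-22 12:00:04"),
]


def latest_checks(rows, old):
    """Emulate SELECT DISTINCT ON (migration_id) ... ORDER BY migration_id, time DESC."""
    # WHERE old / WHERE NOT old
    filtered = [row for row in rows if row[2] == old]
    # ORDER BY migration_id, time DESC: two stable sorts, minor key first.
    filtered.sort(key=itemgetter(4), reverse=True)
    filtered.sort(key=itemgetter(1))
    # DISTINCT ON (migration_id): keep only the first row of each group.
    return [next(group) for _, group in groupby(filtered, key=itemgetter(1))]


if __name__ == "__main__":
    for row in latest_checks(CHECKS, old=True):
        print(row)
```

The sort followed by "first row per group" is exactly why the ORDER BY clause has to start with the DISTINCT ON expression: the row that survives is whichever one sorts first within its migration_id.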

However, if you want better read performance on a big table, use JOIN LATERAL (Postgres 9.3+, standard SQL) together with a multicolumn index such as:

CREATE INDEX checks_special_idx ON checks(old, migration_id, time DESC);

Query 1:

SELECT m.id AS migration_id
     , c.id AS check_id, c.old, c.ok, c.time
FROM   migrations m
-- FROM   (SELECT id FROM migrations LIMIT 150) m
JOIN   LATERAL (
     SELECT id, old, ok, time
     FROM   checks
     WHERE  migration_id = m.id
     AND    old
     ORDER  BY time DESC
     LIMIT  1
     ) c ON TRUE;

Again, switch the condition on old for query 2.
For the unspecified "at most 150 migrations", use the commented-out alternative FROM line.
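
For experimenting without a Postgres server, the same per-migration top-1 lookup can be sketched with Python's built-in sqlite3 module. SQLite has no LATERAL, so this sketch swaps in a correlated subquery instead; it illustrates the logic only and says nothing about Postgres performance. The column name checked_at (used in place of time), the helper names, and the ORDER BY id inside the 150-row subquery are assumptions, not part of the original answer.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database holding the sample rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE migrations (id INTEGER PRIMARY KEY);
        CREATE TABLE checks (
            id           INTEGER PRIMARY KEY,
            migration_id INTEGER NOT NULL REFERENCES migrations(id),
            old          INTEGER NOT NULL,
            ok           INTEGER NOT NULL,
            checked_at   TEXT    NOT NULL
        );
        -- Same column order as the index suggested for Postgres.
        CREATE INDEX checks_special_idx
            ON checks (old, migration_id, checked_at DESC);
    """)
    conn.executemany("INSERT INTO migrations (id) VALUES (?)", [(1,), (2,)])
    conn.executemany("INSERT INTO checks VALUES (?, ?, ?, ?, ?)", [
        (1, 1, 1, 1, "2014-09-22 12:00:01"),
        (2, 1, 0, 1, "2014-09-22 12:00:02"),
        (3, 2, 1, 1, "2014-09-22 12:00:01"),
        (4, 2, 0, 1, "2014-09-22 12:00:02"),
        (5, 1, 1, 1, "2014-09-22 12:00:03"),
        (6, 1, 0, 1, "2014-09-22 12:00:04"),
        (7, 2, 1, 1, "2014-09-22 12:00:03"),
        (8, 2, 0, 1, "2014-09-22 12:00:04"),
    ])
    return conn


def latest_check_per_migration(conn, old, limit=150):
    """Return (migration_id, check_id, old, ok, checked_at) for up to `limit` migrations."""
    sql = """
        SELECT m.id, c.id, c.old, c.ok, c.checked_at
        FROM   (SELECT id FROM migrations ORDER BY id LIMIT ?) AS m
        JOIN   checks AS c
          ON   c.id = (SELECT id
                       FROM   checks
                       WHERE  migration_id = m.id
                       AND    old = ?
                       ORDER  BY checked_at DESC
                       LIMIT  1)
        ORDER  BY m.id
    """
    return conn.execute(sql, (limit, int(old))).fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    for row in latest_check_per_migration(db, old=True):
        print(row)
```

As with the inner join on the LATERAL subquery, a migration that has no matching check simply does not appear in the result.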

SQL Fiddle.

Aside: do not use "time" as an identifier. It is a reserved word in standard SQL and the name of a basic type in Postgres.