How to query the results of a query in Rails (querying the results of a 'DISTINCT ON' with Rails & Postgres)

Time: 2015-10-17 16:19:30

Tags: ruby-on-rails postgresql

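Short version: I want to query the results of another query in order to select a more limited result set. However, adding a where clause rewrites the first query rather than working on its results, so I don't get the answers I need.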

The details: I have two models, Check and Tick. Check has_many ticks.

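The first query uses DISTINCT ON and gathers all of the checks together with their related ticks, but returns only the most recent tick for each check. I have it as a scope in my model.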

In my controller,

  def checklist
    # Filter the results by scope, or return all checks with their latest tick
    case params[:filter]
    when "duebylastresult"
      @checks = Check.mostrecenttickonly.duebylastresult
    when "duebydate"
      @checks = Check.mostrecenttickonly.duebydate
    else
      @checks = Check.mostrecenttickonly
    end
  end

In the model, the first scope (working):

  scope :mostrecenttickonly, -> {
    includes(:ticks)
      .order("checks.id, ticks.created_at DESC")
      .select("DISTINCT ON (checks.id) *").references(:ticks)
  }

This generates the following SQL:

  Parameters: {"filter"=>""}
  SQL (1.0ms)  SELECT DISTINCT ON (checks.id) *, 
"checks"."id" AS t0_r0, 
"checks"."area" AS t0_r1, "checks"."frequency" AS t0_r2, 
"checks"."showinadvance" AS t0_r3, "checks"."category" AS t0_r4, 
"checks"."title" AS t0_r5, "checks"."description" AS t0_r6, 
"checks"."created_at" AS t0_r7, "checks"."updated_at" AS t0_r8, 
"ticks"."id" AS t1_r0, "ticks"."result" AS t1_r1, 
"ticks"."comments" AS t1_r2, "ticks"."created_at" AS t1_r3, 
"ticks"."updated_at" AS t1_r4, "ticks"."check_id" AS t1_r5 
FROM "checks" LEFT OUTER JOIN "ticks" 
ON "ticks"."check_id" = "checks"."id"  
ORDER BY checks.id, ticks.created_at DESC

Having got that result, I want to show only the ticks with a value of 3 or greater, hence the scope:

   scope :duebylastresult, -> { where("ticks.result >= 3") }

Which generates the SQL:

  Parameters: {"filter"=>"duebylastresult"}
  SQL (1.0ms)  SELECT DISTINCT ON (checks.id) *, 
"checks"."id" AS t0_r0, 
"checks"."area" AS t0_r1, "checks"."frequency" AS t0_r2,
"checks"."showinadvance" AS t0_r3, "checks"."category" AS t0_r4, 
"checks"."title" AS t0_r5, "checks"."description" AS t0_r6, 
"checks"."created_at" AS t0_r7, "checks"."updated_at" AS t0_r8, 
"ticks"."id" AS t1_r0, "ticks"."result" AS t1_r1, 
"ticks"."comments" AS t1_r2, "ticks"."created_at" AS t1_r3, 
"ticks"."updated_at" AS t1_r4, "ticks"."check_id" AS t1_r5 
FROM "checks" LEFT OUTER JOIN "ticks" 
ON "ticks"."check_id" = "checks"."id" 
WHERE (ticks.result >= 3)  
ORDER BY checks.id, ticks.created_at DESC

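As best I can tell, the WHERE clause is applied before the DISTINCT ON clause, so I now get 'the latest tick where the result is >= 3', whereas I am looking for 'the latest tick, and then only if its result is >= 3'.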

Hope that makes sense & thanks in advance!

Edit - an example of what I get and what I need:

The Data:
Table Checks:
ID: 98 Title: Eire
ID: 99 Title: Land

Table Ticks:
ID: 1 CheckID: 98 Result:1 Date: Jan12
ID: 2 CheckID: 98 Result:5 Date: Feb12
ID: 3 CheckID: 98 Result:1 Date: Mar12
ID: 4 CheckID: 99 Result:4 Date: Apr12

First query returns the most recent tick for each check, like:
Check.ID: 98  Tick.ID: 3  Tick.Result: 1 Tick.Date: Mar12
Check.ID: 99  Tick.ID: 4  Tick.Result: 4 Tick.Date: Apr12

Second query currently returns the most recent tick whose result is >= 3, like:
Check.ID: 98  Tick.ID: 2  Tick.Result: 5 Tick.Date: Feb12
Check.ID: 99  Tick.ID: 4  Tick.Result: 4 Tick.Date: Apr12

When I really want:
Check.ID: 99  Tick.ID: 4  Tick.Result: 4 Tick.Date: Apr12

(ID 98 doesn't show, as its last Tick.Result is 1.)
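
The difference between the two orderings can be reproduced in plain Ruby with the sample data above. This is only an illustrative sketch: the `Tick` struct and the `latest_per_check` helper are hypothetical stand-ins (not part of the app), and `created_at` is an ordinal that stands in for the abbreviated dates.

```ruby
# Hypothetical stand-in for a row of the ticks table.
Tick = Struct.new(:id, :check_id, :result, :created_at)

# created_at is an ordinal standing in for Jan12 < Feb12 < Mar12 < Apr12.
TICKS = [
  Tick.new(1, 98, 1, 1),
  Tick.new(2, 98, 5, 2),
  Tick.new(3, 98, 1, 3),
  Tick.new(4, 99, 4, 4)
].freeze

# Mimics DISTINCT ON (checks.id) ... ORDER BY ticks.created_at DESC:
# keep only the newest tick for each check.
def latest_per_check(ticks)
  ticks.group_by(&:check_id).map { |_check_id, group| group.max_by(&:created_at) }
end

# What the chained scopes do: WHERE runs first, then DISTINCT ON.
filter_then_latest = latest_per_check(TICKS.select { |t| t.result >= 3 })

# What is wanted: DISTINCT ON first, then the filter on its result.
latest_then_filter = latest_per_check(TICKS).select { |t| t.result >= 3 }

p filter_then_latest.map(&:id) # => [2, 4]
p latest_then_filter.map(&:id) # => [4]
```

Filtering first lets tick 2 stand in as the "latest" tick for check 98, which is exactly the unwanted row in the second result above.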

2 Answers:

Answer 0 (score: 1)

Could you try the following and see whether it gets you started in the right direction:

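    scope :just_a_test, -> {
      includes(:ticks)
        .references(:ticks) # makes Rails join ticks for the string conditions below
        .order("checks.id")
        # keep only each check's newest tick ...
        .where("ticks.created_at = (SELECT MAX(ticks.created_at) FROM ticks WHERE ticks.check_id = checks.id)")
        # ... and only if that tick's result is at least 3
        .where("ticks.result >= 3")
    }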

Answer 1 (score: 0)

I'm not sure I really understand the :mostrecenttickonly scope, since you are only loading the checks.

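That being said, if you want to get just those checks whose most recent tick has a result of at least 3, I think the best way to do so is a window function: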

check.rb

...
  scope :duebylastresult, -> {
    find_by_sql(
      'SELECT *
       FROM (SELECT checks.*,
                    ticks.id AS tick_ids,
                    ticks.created_at AS tick_date,
                    ticks.result AS tick_result,
                    dense_rank() OVER (
                      PARTITION BY checks.id
                      ORDER BY ticks.created_at DESC
                    ) AS tick_rank
             FROM checks
             LEFT OUTER JOIN ticks ON checks.id = ticks.check_id) AS ranked_ticks
       WHERE tick_rank = 1 AND tick_result >= 3;'
    )
  }
...

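Basically, we just join everything in the checks and ticks tables, then add another attribute called tick_rank, which ranks each row in the result set by its tick date against the other rows that share the same checks.id value.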
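
To make the ranking step concrete, here is a small plain-Ruby emulation of what dense_rank() computes per partition. It is only a sketch: `Row`, `ROWS` and `with_tick_rank` are hypothetical names, the rows mirror the sample data from the question, and `tick_date` is an ordinal rather than a real timestamp.

```ruby
# Hypothetical stand-in for one row of the checks/ticks join.
Row = Struct.new(:check_id, :tick_id, :tick_date)

ROWS = [
  Row.new(98, 1, 1),
  Row.new(98, 2, 2),
  Row.new(98, 3, 3),
  Row.new(99, 4, 4)
].freeze

# Emulates dense_rank() OVER (PARTITION BY check_id ORDER BY tick_date DESC):
# rows are ranked within their own check, newest first, and rows with an
# equal tick_date share a rank.
def with_tick_rank(rows)
  rows.group_by(&:check_id).flat_map do |_check_id, partition|
    # Distinct dates, newest first; a date's position + 1 is its dense rank.
    dates = partition.map(&:tick_date).uniq.sort.reverse
    partition.map { |row| [row, dates.index(row.tick_date) + 1] }
  end
end

with_tick_rank(ROWS).each do |row, rank|
  puts "check #{row.check_id} tick #{row.tick_id} rank #{rank}"
end
```

Ticks 3 and 4 come out with rank 1, one per check, which is why the outer query can pick the latest tick with tick_rank = 1.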

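The way SQL works, the predicates (the conditions in the WHERE clause) are evaluated before the fields in the SELECT list are computed, which means we can't just write tick_rank = 1 in this same statement.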

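So we have to take the extra step of wrapping the result (which we alias as ranked_ticks), and then select everything from it and apply the predicates we want to this outer select statement. tick_rank has to be 1, meaning it is the most recent tick, and the result has to be >= 3.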
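
The same wrap-then-filter idea can also be applied to the DISTINCT ON query from the question, without a window function. The following is a sketch under assumptions: `latest_tick_due_sql` is a hypothetical helper, the `latest` alias and the `tick_*` column aliases are made up for illustration, and the table and column names are taken from the logged SQL in the question.

```ruby
# Hypothetical helper: builds SQL that filters the *result* of the
# DISTINCT ON query by wrapping it in a derived table.
def latest_tick_due_sql(min_result)
  # Integer() raises on non-numeric input, so only a number is interpolated.
  threshold = Integer(min_result)
  <<~SQL
    SELECT latest.*
    FROM (
      SELECT DISTINCT ON (checks.id)
             checks.*,
             ticks.id         AS tick_id,
             ticks.result     AS tick_result,
             ticks.created_at AS tick_created_at
      FROM checks
      LEFT OUTER JOIN ticks ON ticks.check_id = checks.id
      ORDER BY checks.id, ticks.created_at DESC
    ) AS latest
    WHERE latest.tick_result >= #{threshold}
  SQL
end

puts latest_tick_due_sql(3)
# In the Rails app this string could be handed to Check.find_by_sql(...),
# which returns an Array of Check objects rather than a chainable relation.
```

Because the WHERE sits outside the derived table, it only ever sees the single newest tick per check, so a check whose latest result is below the threshold drops out instead of falling back to an older tick.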

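Edit: I had been using the article I linked as a refresher, since I often forget SQL syntax, but after looking at it again I think this would be more performant (basically just waiting to join checks until after the partitioning is done, which I believe means fewer full scans):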