Postgres: inner join with AND conditions on the same field

Time: 2015-01-11 11:26:02

Tags: sql postgresql

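A Quiz can have many Submissions. I want to fetch all Quizzes that have at least one associated Submission with submissions.correct = t and at least one associated Submission with submissions.correct = f.

How can I fix the following query and its WHERE clause to achieve this: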

SELECT quizzes.*,
       Count(submissions.id) AS submissions_count
FROM   "quizzes"
       INNER JOIN "submissions"
               ON "submissions"."quiz_id" = "quizzes"."id"
WHERE  ( submissions.correct = 'f' )
       AND ( submissions.correct = 't' )
GROUP  BY quizzes.id
ORDER  BY submissions_count ASC

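Update

Here is the missing information:

I need all row data from quizzes. I only need the count for ordering within the query (quizzes with the fewest submissions first).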

    k-voc_development=# \d quizzes;
                                         Table "public.quizzes"
       Column   |            Type             |                      Modifiers                       
    ------------+-----------------------------+------------------------------------------------------
     id         | integer                     | not null default nextval('quizzes_id_seq'::regclass)
     question   | character varying           | not null
     created_at | timestamp without time zone | not null
     updated_at | timestamp without time zone | not null
    Indexes:
        "quizzes_pkey" PRIMARY KEY, btree (id)
    Referenced by:
        TABLE "submissions" CONSTRAINT "fk_rails_04e433a811" FOREIGN KEY (quiz_id) REFERENCES quizzes(id)
        TABLE "answers" CONSTRAINT "fk_rails_431b8a33a3" FOREIGN KEY (quiz_id) REFERENCES quizzes(id)

    k-voc_development=# \d submissions;
                                         Table "public.submissions"
       Column   |            Type             |                        Modifiers                         
    ------------+-----------------------------+----------------------------------------------------------
     id         | integer                     | not null default nextval('submissions_id_seq'::regclass)
     quiz_id    | integer                     | not null
     correct    | boolean                     | not null
     created_at | timestamp without time zone | not null
     updated_at | timestamp without time zone | not null
    Indexes:
        "submissions_pkey" PRIMARY KEY, btree (id)
        "index_submissions_on_quiz_id" btree (quiz_id)
    Foreign-key constraints:
        "fk_rails_04e433a811" FOREIGN KEY (quiz_id) REFERENCES quizzes(id)

    k-voc_development=# 

4 answers:

Answer 0: (score: 2)

-- I want to fetch all Quizzes
SELECT * FROM quizzes q
WHERE EXISTS ( -- that have at least one associated Submission with submissions.correct = t
    SELECT * FROM submissions s
    WHERE s.quiz_id = q.id AND s.correct = 't'
    )
AND EXISTS ( -- and at least one associated Submission with submissions.correct = f.
    SELECT * FROM submissions s
    WHERE s.quiz_id = q.id AND s.correct = 'f'
    );
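
The query above returns the qualifying quizzes but not in the order requested in the update. A minimal sketch of one way to add that ordering, assuming the quizzes and submissions tables from the question and a boolean correct column; the correlated subquery in ORDER BY is an illustrative choice, not part of the original answer:

```sql
-- Same two EXISTS tests, plus ordering by the number of submissions per quiz.
SELECT q.*
FROM   quizzes q
WHERE  EXISTS (SELECT 1 FROM submissions s WHERE s.quiz_id = q.id AND s.correct)
AND    EXISTS (SELECT 1 FROM submissions s WHERE s.quiz_id = q.id AND NOT s.correct)
ORDER  BY (SELECT count(*) FROM submissions s WHERE s.quiz_id = q.id);  -- fewest first
```

The count is computed once per qualifying quiz, so this is likely to suit cases where only a few quizzes qualify.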

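Answer 1: (score: 2)

The best solution depends on the details of your implementation, data distribution and requirements.

If you have a typical installation with referential integrity (FK constraint) and submissions.correct defined as boolean NOT NULL, and you only need the quiz_id and the total number of submissions, then you don't need to join to quizzes at all, and this should be fastest: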

SELECT quiz_id, count(*) AS ct
FROM   submissions
-- WHERE  correct IS NOT NULL -- only relevant if correct can be NULL
GROUP  BY 1
HAVING bool_or(correct)
AND    bool_or(NOT correct);

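The dedicated aggregate function bool_or() is particularly useful for tests on boolean values. It is simpler and faster than CASE expressions or similar constructs.

There are many other techniques; the best solution depends on the missing information.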
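
To see what the two bool_or() calls evaluate to per group, here is a small self-contained sketch. The rows in the CTE are made-up sample data standing in for the submissions table, not data from the question:

```sql
-- Hypothetical sample rows; the CTE shadows the real submissions table.
WITH submissions (quiz_id, correct) AS (
   VALUES (1, true), (1, false)   -- quiz 1: mixed results
        , (2, true), (2, true)    -- quiz 2: only correct
        , (3, false)              -- quiz 3: only incorrect
   )
SELECT quiz_id
     , count(*)             AS ct
     , bool_or(correct)     AS any_correct    -- true if at least one row is true
     , bool_or(NOT correct) AS any_incorrect  -- true if at least one row is false
FROM   submissions
GROUP  BY quiz_id
ORDER  BY quiz_id;
```

Only quiz 1 has both flags true, so it is the only group a HAVING clause combining both conditions would keep.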

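For your updated requirement:

  "I need all row data from quizzes. I only need the count for ordering within the query (quizzes with the fewest submissions first)."

If many of the quizzes qualify (a high percentage of the total), this should be fastest: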

SELECT q.*
FROM  (
   SELECT quiz_id, count(*) AS ct
   FROM   submissions
   GROUP  BY 1
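   HAVING count(*) > count(correct OR NULL)  -- at least one submission with correct = false
   AND    count(correct OR NULL) > 0         -- at least one with correct = true (otherwise all-false quizzes pass too)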
   ) s
JOIN   quizzes q ON q.id = s.quiz_id
ORDER  BY s.ct;

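count(*) > count(correct OR NULL) works because correct is boolean NOT NULL: count(correct OR NULL) counts only the rows where correct is true, so the comparison holds exactly when at least one row is false. For few submissions per quiz, this should be slightly faster than the variant above.

Answer 2: (score: 1)

If submissions.correct can hold no values other than t and f, then this will work: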
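
The effect of count(correct OR NULL) is easier to see on a few rows. This is a sketch with made-up sample data (the CTE is hypothetical and only stands in for the submissions table):

```sql
-- true OR NULL is true (counted); false OR NULL is NULL (not counted).
WITH submissions (quiz_id, correct) AS (
   VALUES (1, true), (1, false)
        , (2, true), (2, true)
        , (3, false)
   )
SELECT quiz_id
     , count(*)                   AS total
     , count(correct OR NULL)     AS correct_ct
     , count(NOT correct OR NULL) AS incorrect_ct
FROM   submissions
GROUP  BY quiz_id
ORDER  BY quiz_id;
```

Note that for quiz 3, which has only incorrect submissions, total is still greater than correct_ct. A test such as correct_ct > 0 is needed as well if both outcomes must be present.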

SELECT quizzes.*,
       Count(submissions.id) AS submissions_count
FROM   "quizzes"
       INNER JOIN "submissions"
               ON "submissions"."quiz_id" = "quizzes"."id"
GROUP  BY quizzes.id
HAVING COUNT(DISTINCT submissions.correct) >= 2
ORDER  BY submissions_count ASC 

Answer 3: (score: 1)

Move the WHERE conditions to the HAVING clause and use conditional Count aggregates:

SELECT quizzes.*,
       Count(submissions.id) AS submissions_count
FROM   "quizzes"
       INNER JOIN "submissions"
               ON "submissions"."quiz_id" = "quizzes"."id"
GROUP  BY quizzes.id
HAVING Count(CASE WHEN submissions.correct = 'f' THEN 1 END) >= 1
        and Count(CASE WHEN submissions.correct = 't' THEN 1 END) >= 1
ORDER  BY submissions_count ASC
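
As a variation on the same idea, the conditional counts can be written with the aggregate FILTER clause. This is a sketch assuming PostgreSQL 9.4 or later (where FILTER is available) and the schema from the question; the aliases q and s are illustrative:

```sql
SELECT q.*,
       count(s.id) AS submissions_count
FROM   quizzes q
       INNER JOIN submissions s
               ON s.quiz_id = q.id
GROUP  BY q.id                                      -- q.id is the primary key, so q.* is allowed
HAVING count(*) FILTER (WHERE s.correct) >= 1       -- at least one correct submission
       AND count(*) FILTER (WHERE NOT s.correct) >= 1  -- at least one incorrect submission
ORDER  BY submissions_count ASC;
```

On older versions the CASE form above remains the portable choice.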