Select distinct values from each partition of over()

Time: 2017-07-24 07:29:55

Tags: postgresql distinct row-number

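I have a query that should return questions from different categories according to their difficulty level.

Some questions are similar to others (I store that relationship in a field called "bucket").

What I want is for at most one question to be returned from each bucket.

The query I am trying is: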

select *
            from (
                select distinct q.bucket,
                    row_number() over (partition by dl.value order by random()) as rn,
                    dense_rank() over (partition by dl.value, LOWER(qc.value) = LOWER('general') order by random()) as rnc,
                    dl.value, qc.value as question_category,
                    q.question_text, q.option_a, q.option_b, q.option_c, q.option_d,
                    q.correct_answer, q.image_link, q.question_type
                from
                    questions_bank q
                    inner join
                    question_category qc on qc.id = q.question_category_id
                    inner join
                    sports_type st on st.id = q.sports_type_id
                    inner join
                    difficulty_level dl on dl.id = q.difficulty_level_id
                where st.game_type = lower('cricket') and dl.value in ('E','M','H')
            ) s
            where
                (value = 'E' and rnc <= 6 and LOWER(question_category) != LOWER('general')) or
                (value = 'E' and rnc <= 6 and LOWER(question_category) = LOWER('general')) or
                value = 'M' and rn <= 0 or
                value = 'H' and rn <= 0;

This does not return the desired output.

The output of this query is:

 bucket | rn | rnc | value | question_category | question_text | option_a | option_b | option_c | option_d | correct_answer |                 image_link                  | question_type
--------+----+-----+-------+-------------------+---------------+----------+----------+----------+----------+----------------+---------------------------------------------+---------------
      2 |  2 |   2 | E     | General           | abs           | a        | b        | c        | d        | option_a       | https://d1ugevkr3ygvej.cloudfront.net/2.png | i
      3 |  3 |   3 | E     | General           | abcd          | a        | b        | c        | d        | option_a       | https://d1ugevkr3ygvej.cloudfront.net/3.png | i
      3 |  4 |   4 | E     | General           | abs           | a        | b        | c        | d        | option_a       |                                             | t
      4 |  1 |   1 | E     | General           | image         | a        | b        | c        | d        | option_a       |                                             | t

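Notice that the bucket value 3 appears twice. I don't want the combination of row_number and bucket to be what is distinct. The bucket should be deduplicated first, and only then should the row number be computed, with the partition based on the question_category value.

How can I solve this?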

1 answer:

Answer 0: (score: 0)

The solution is not to use DISTINCT, but

SELECT DISTINCT ON (q.bucket) ...

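See the documentation.

This returns only one row per q.bucket. If you add an ORDER BY clause to the query, it picks the first row in that order; otherwise, which row counts as the "first row" is unpredictable.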
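
As a rough sketch of how this could fit the query from the question: in PostgreSQL, window functions are evaluated before DISTINCT ON within the same SELECT, so one option is to deduplicate by bucket in an inner subquery and number the surviving rows in an outer one. The table and column names below are taken from the question; the trimmed column list, the alias d, and the final filter (rn <= 6 for 'E') are assumptions made for illustration, not a drop-in replacement.

```sql
-- Sketch only: names come from the question; the final filter is illustrative.
SELECT *
FROM (
    SELECT d.*,
           -- Step 2: number the surviving rows within each difficulty level.
           row_number() OVER (PARTITION BY d.value ORDER BY random()) AS rn
    FROM (
        -- Step 1: keep exactly one row per bucket.
        SELECT DISTINCT ON (q.bucket)
               q.bucket,
               dl.value,
               qc.value AS question_category,
               q.question_text
        FROM questions_bank q
        INNER JOIN question_category qc ON qc.id = q.question_category_id
        INNER JOIN sports_type st       ON st.id = q.sports_type_id
        INNER JOIN difficulty_level dl  ON dl.id = q.difficulty_level_id
        WHERE st.game_type = lower('cricket')
          AND dl.value IN ('E', 'M', 'H')
        -- The DISTINCT ON expression has to lead the ORDER BY list;
        -- random() then decides which row of each bucket comes first.
        ORDER BY q.bucket, random()
    ) d
) s
WHERE s.value = 'E' AND s.rn <= 6;
```

If the numbering should restart per category, as the question asks, the PARTITION BY list in the outer query can include d.question_category as well. Replacing random() in the inner ORDER BY with a real column would make the choice within each bucket deterministic.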