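Using Postgres tablefunc crosstab() to count wrong answers

Date: 2019-06-12 18:11:26

Tags: sql postgresql crosstab

I have a view (let's call it "fruit") that contains a column of correct answers and the associated wrong responses from a multiple-choice exam. I want to count which wrong responses are chosen most often (that is, which ones are easily confused). The view looks like this: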

correct_answer | wrong_response
-------------------------------
apple          |   pear
apple          |   pear
apple          |   banana
banana         |   apple
banana         |   pear
banana         |   pear
banana         |   pear
pear           |   apple
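
For anyone who wants to reproduce this, here is a minimal sketch that recreates the sample rows above as a view. The name fruit comes from the description; building it straight from a VALUES list is an assumption made for illustration, since the real view presumably selects from underlying tables.

```sql
-- Hypothetical stand-in for the real view: same columns, same eight rows.
CREATE VIEW fruit (correct_answer, wrong_response) AS
VALUES
    ('apple',  'pear'),
    ('apple',  'pear'),
    ('apple',  'banana'),
    ('banana', 'apple'),
    ('banana', 'pear'),
    ('banana', 'pear'),
    ('banana', 'pear'),
    ('pear',   'apple');
```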

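What I want is a pivot table that counts the wrong responses for each correct answer, so that the columns represent the correct answers and the rows hold the counts for each wrong response.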

wrong_response | apple | banana | pear
---------------------------------------
apple          | 0     | 1      | 1
banana         | 1     | 0      | 0
pear           | 2     | 3      | 0

I have used this function before, but at that time I wasn't trying to compute counts. Any help would be greatly appreciated!

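Edit: For future readers, both solutions work! However, conditional aggregation is more flexible. The crosstab solution only works if the data contains exactly the expected set of categories. For example, if pear is excluded (or kiwi is added), the crosstab solution returns an error, because the category query no longer yields the three value columns declared in the AS clause. Conditional aggregation returns a result regardless of whether records are excluded (or ones that don't currently exist are added). Thanks for the help.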

2 answers:

Answer 0 (score: 1)

If you know the columns in advance, you can use conditional aggregation:

select wrong_response,
       count(*) filter (where correct_answer = 'apple') as apple,
       count(*) filter (where correct_answer = 'pear') as pear,
       count(*) filter (where correct_answer = 'banana') as banana
from fruit  -- the view described in the question
group by wrong_response;
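
The FILTER clause used above requires PostgreSQL 9.4 or later. As a sketch for older versions, the same counts can be written with CASE expressions. This assumes the view is named fruit as in the question; the ORDER BY is added only to make the row order predictable.

```sql
-- Conditional aggregation without FILTER: each CASE yields 1 for a
-- matching row and 0 otherwise, so sum() counts the matches and
-- combinations that never occur come out as 0 rather than NULL.
SELECT wrong_response,
       sum(CASE WHEN correct_answer = 'apple'  THEN 1 ELSE 0 END) AS apple,
       sum(CASE WHEN correct_answer = 'banana' THEN 1 ELSE 0 END) AS banana,
       sum(CASE WHEN correct_answer = 'pear'   THEN 1 ELSE 0 END) AS pear
FROM fruit
GROUP BY wrong_response
ORDER BY wrong_response;
```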

Answer 1 (score: 1)

Assuming you have already run: CREATE EXTENSION tablefunc;

Then you can get what you want with the crosstab() function:

SELECT *
FROM crosstab('SELECT wrong_response,
                      correct_answer,
                      count(*)
               FROM fruit
               GROUP BY wrong_response, correct_answer 
               ORDER BY wrong_response',

              'SELECT correct_answer
               FROM fruit
               GROUP BY correct_answer
               ORDER BY correct_answer')

AS (wrong_answer varchar(20),
    apple bigint,
    banana bigint,
    pear bigint);

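The code above gives you the following result: [image: query result]

Note that where you expected 0, the output is NULL. To get what you want, you only need to modify the SELECT slightly: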

SELECT
    wrong_answer,
    coalesce(apple, 0) as apple,
    coalesce(banana, 0) as banana,
    coalesce(pear, 0) as pear
FROM crosstab('SELECT wrong_response,
                      correct_answer,
                      count(*)
               FROM fruit
               GROUP BY wrong_response, correct_answer 
               ORDER BY wrong_response',

              'SELECT correct_answer
               FROM fruit
               GROUP BY correct_answer
               ORDER BY correct_answer')

AS (wrong_answer varchar(20),
    apple bigint,
    banana bigint,
    pear bigint);

This gets you the desired result:

[image: query result]
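
If the set of categories in the data may change, one possible variation is to hard-code the category list instead of deriving it from the view. This is a sketch, not part of the original answer: with the two-argument form of crosstab(), source rows whose category is not in the list are ignored, and listed categories with no rows come back as NULL, so the number of value columns always matches the AS clause. The alias ct and the text type for wrong_answer are choices made here for illustration.

```sql
-- Requires: CREATE EXTENSION tablefunc;
SELECT wrong_answer,
       coalesce(apple, 0)  AS apple,
       coalesce(banana, 0) AS banana,
       coalesce(pear, 0)   AS pear
FROM crosstab(
       -- Source query: row name, category, value; ordered by row name.
       $$SELECT wrong_response, correct_answer, count(*)
         FROM fruit
         GROUP BY wrong_response, correct_answer
         ORDER BY wrong_response$$,
       -- Fixed category list, in the same order as the value columns below.
       $$VALUES ('apple'), ('banana'), ('pear')$$
     ) AS ct (wrong_answer text,
              apple  bigint,
              banana bigint,
              pear   bigint);
```

With this form, adding kiwi rows to the view would not raise an error, but kiwi would be left out of the result until it is added to both the VALUES list and the column definitions.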