Summarizing the ratings of many products

Date: 2015-11-16 14:11:21

Tags: postgresql

I have a table project_consummation and a table project_product. People consume products and may give a product a rating from 1 to 10.

Allow me to use some ASCII art to clarify:

+--------------------------+     +-------------------------+
| project_product          |     | project_consummation    |
|--------------------------|     |-------------------------|
| id integer primary key   |-\   | id integer primary key  |
| name varchar             |  \->| product_id integer      |
| ...                      |     | rating integer          |
| various other fields...  |     | user_id integer         |
+--------------------------+     | ...                     |
                                 | various other fields... |
                                 +-------------------------+

Now I want to produce an overview of the votes for each product. Of course, a consummation may have no rating value (i.e. NULL), so those rows must be ignored.

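The result should look like this (each rating from 1 to 10 should have its own column showing how many people gave the product that rating, plus the total number of ratings, num_ratings, and later some statistics such as the median rating, the standard deviation, etc.):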

 product_id | rating1 | rating2 | ... |rating10 | num_ratings
------------+---------+---------+-----+---------+-------------
       1002 |         |         | ... |       1 |           1
       1014 |       4 |         | ... |       2 |           6
       1015 |       2 |       1 | ... |       1 |           4

I created a very clumsy "solution" that does a separate LEFT OUTER JOIN for each rating column, and I am sure you can see how quickly that turns into a mess.
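
For illustration, a join-per-column query might look roughly like the sketch below, limited to the first three rating columns. This is a hypothetical reconstruction based only on the description above, not the asker's original query; the aliases r1, r2, r3 and cnt are made up for the example.

```sql
-- Hypothetical reconstruction of the "one LEFT OUTER JOIN per rating" approach.
-- Not the asker's original query; aliases r1..r3 and cnt are illustrative.
SELECT
    p.id   AS product_id,
    r1.cnt AS rating1,
    r2.cnt AS rating2,
    r3.cnt AS rating3
FROM project_product p
LEFT OUTER JOIN (
    SELECT product_id, count(*) AS cnt
    FROM project_consummation
    WHERE rating = 1
    GROUP BY product_id
) r1 ON r1.product_id = p.id
LEFT OUTER JOIN (
    SELECT product_id, count(*) AS cnt
    FROM project_consummation
    WHERE rating = 2
    GROUP BY product_id
) r2 ON r2.product_id = p.id
LEFT OUTER JOIN (
    SELECT product_id, count(*) AS cnt
    FROM project_consummation
    WHERE rating = 3
    GROUP BY product_id
) r3 ON r3.product_id = p.id
ORDER BY p.id;
```

Each additional rating column adds another subquery and another scan of project_consummation, which is why this pattern scales poorly to ten columns.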

What would be a better solution, both for cleaner code and especially for better performance?

2 answers:

Answer 0 (score: 2)

Do it like this:

select
        p.id product_id,
        count(case when c.rating = 1 then 1 else null end) rating1,
        count(case when c.rating = 2 then 1 else null end) rating2,
        count(case when c.rating = 3 then 1 else null end) rating3,
        count(case when c.rating = 4 then 1 else null end) rating4,
        count(case when c.rating = 5 then 1 else null end) rating5,
        count(case when c.rating = 6 then 1 else null end) rating6,
        count(case when c.rating = 7 then 1 else null end) rating7,
        count(case when c.rating = 8 then 1 else null end) rating8,
        count(case when c.rating = 9 then 1 else null end) rating9,
        count(case when c.rating = 10 then 1 else null end) rating10,
        count(c.rating) num_ratings
    from project_product p
    left join project_consummation c on c.product_id = p.id
        group by p.id
        order by p.id;

Or, with a shorter form for the rating columns:

select
            p.id product_id,
            count(nullif(c.rating = 1, false)) rating1,
            count(nullif(c.rating = 2, false)) rating2,
            count(nullif(c.rating = 3, false)) rating3,
            count(nullif(c.rating = 4, false)) rating4,
            count(nullif(c.rating = 5, false)) rating5,
            count(nullif(c.rating = 6, false)) rating6,
            count(nullif(c.rating = 7, false)) rating7,
            count(nullif(c.rating = 8, false)) rating8,
            count(nullif(c.rating = 9, false)) rating9,
            count(nullif(c.rating = 10, false)) rating10,
            count(c.rating) num_ratings
        from project_product p
        left join project_consummation c on c.product_id = p.id
            group by p.id
            order by p.id;
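
The question also mentions a median and a standard deviation. As a sketch only, and assuming PostgreSQL 9.4 or later (where the aggregate FILTER clause and the ordered-set aggregate percentile_cont are available), the same single-join query could be extended as follows. The output column names median_rating and stddev_rating are assumptions chosen for this example.

```sql
-- Sketch only: assumes PostgreSQL 9.4+ and the two tables from the question.
-- median_rating and stddev_rating are illustrative column names.
SELECT
    p.id AS product_id,
    count(*) FILTER (WHERE c.rating = 1)  AS rating1,
    count(*) FILTER (WHERE c.rating = 2)  AS rating2,
    count(*) FILTER (WHERE c.rating = 3)  AS rating3,
    count(*) FILTER (WHERE c.rating = 4)  AS rating4,
    count(*) FILTER (WHERE c.rating = 5)  AS rating5,
    count(*) FILTER (WHERE c.rating = 6)  AS rating6,
    count(*) FILTER (WHERE c.rating = 7)  AS rating7,
    count(*) FILTER (WHERE c.rating = 8)  AS rating8,
    count(*) FILTER (WHERE c.rating = 9)  AS rating9,
    count(*) FILTER (WHERE c.rating = 10) AS rating10,
    -- count(column) skips NULLs, so unrated consummations are ignored
    count(c.rating) AS num_ratings,
    -- both aggregates below also ignore NULL ratings
    percentile_cont(0.5) WITHIN GROUP (ORDER BY c.rating) AS median_rating,
    stddev_samp(c.rating) AS stddev_rating
FROM project_product p
LEFT JOIN project_consummation c ON c.product_id = p.id
GROUP BY p.id
ORDER BY p.id;
```

For a product with no ratings the counts are 0 and the two statistics are NULL; stddev_samp is also NULL when there is only a single rating.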

Answer 1 (score: 1)

Not perfect... but I hope you get the idea.

Using CASE:

SELECT project_product.id,project_product.name 
     , sum(case when rating = 1 then 1 else 0 end ) as rating1
     , sum(case when rating = 2 then 1 else 0 end ) as rating2
     , sum(case when rating = 3 then 1 else 0 end ) as rating3
     , sum(case when rating = 4 then 1 else 0 end ) as rating4
     , sum(case when rating = 5 then 1 else 0 end ) as rating5
     , sum(case when rating = 6 then 1 else 0 end ) as rating6
     , sum(case when rating = 7 then 1 else 0 end ) as rating7
     , sum(case when rating = 8 then 1 else 0 end ) as rating8
     , sum(case when rating = 9 then 1 else 0 end ) as rating9
     , sum(case when rating = 10 then 1 else 0 end ) as rating10
  FROM project_product 
  LEFT JOIN project_consummation ON (project_product.id = project_consummation.product_id)
  GROUP BY project_product.id, project_product.name 

And using crosstab:

-- if necessary:
-- CREATE EXTENSION tablefunc;

SELECT project_product.id,
       rating1, rating2, rating3, rating4, rating5,
       rating6, rating7, rating8, rating9, rating10, 
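       -- coalesce: crosstab leaves a column NULL when no row has that rating
       coalesce(rating1, 0) + coalesce(rating2, 0) + coalesce(rating3, 0) +
       coalesce(rating4, 0) + coalesce(rating5, 0) + coalesce(rating6, 0) +
       coalesce(rating7, 0) + coalesce(rating8, 0) + coalesce(rating9, 0) +
       coalesce(rating10, 0) as num_ratings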
  FROM project_product
  LEFT JOIN crosstab(
       'select product_id, rating, count(*)
          from project_consummation
         group by product_id, rating
         order by product_id, rating ',
       'select generate_series(1, 10)')
       AS main (
         id integer, rating1 integer, rating2 integer, rating3 integer,
         rating4 integer, rating5 integer, rating6 integer,
         rating7 integer, rating8 integer, rating9 integer, rating10 integer
       )  ON (project_product.id = main.id )