SQL query to get the most common value in a column

Time: 2013-09-07 12:42:13

Tags: sql postgresql

I have two tables, as follows:

Sales:

    Date    |   Customer   |    ItemSold 
-----------------------------------------
11/01/2013  |     Alex     |     Pen
12/01/2013  |     Rony     |     Paper
13/01/2013  |     Alex     |     Eraser
14/01/2013  |     Marty    |     Eraser
15/01/2013  |     Alex     |     Pen
16/01/2013  |     Rob      |     Paper
17/01/2013  |     Alex     |     Pencil
18/01/2013  |     Alex     |     Pen
19/01/2013  |     Ned      |     Pen
20/01/2013  |     Alex     |     Paper
21/01/2013  |     Alex     |     Pencil
22/01/2013  |     Ned      |     Pen
23/01/2013  |     Alex     |     Eraser
24/01/2013  |     Alex     |     Pen
25/01/2013  |     Alex     |     Pen
26/01/2013  |     Alex     |     Paper
27/01/2013  |     Ned      |     Paper
28/01/2013  |     Alex     |     Pen
29/01/2013  |     Alex     |     Eraser
30/01/2013  |     Alex     |     Pen
31/01/2013  |     Rony     |     Pencil
01/02/2013  |     Alex     |     Eraser
02/02/2013  |     Ned      |     Paper
03/02/2013  |     Alex     |     Pen

Priority:

ItemName    |    Priority
--------------------------
Pen         |       1
Paper       |       2
Pencil      |       3
Eraser      |       4

I want a list showing which item each customer is most likely to buy, like this:

Name   |   Item
----------------
Alex   |   Pen
Rob    |   Paper
Ned    |   Pen
Marty  |   Eraser
Rony   |   Paper

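If two items are tied, the one with the highest priority should be chosen. Ned bought Pen and Paper twice each, but Pen should be chosen because it has a higher priority than Paper.

What SQL query would do this?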

3 answers:

Answer 0: (score: 1)

In statistics, the term you are looking for is the mode. Here is one way to compute it using window/analytic functions:

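select customer, ItemSold
from (select s.customer, s.ItemSold, count(*) as cnt,
             -- rank each customer's items: most purchases first, ties broken by priority;
             -- min() keeps priority valid under the group by (each item has one priority)
             row_number() over (partition by s.customer
                                order by count(*) desc, min(p.priority)
                               ) as seqnum
      from sales s left outer join
           priority p
           on s.ItemSold = p.ItemName
      group by s.customer, s.ItemSold
     ) ci
where seqnum = 1;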
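
As a cross-check outside the database, here is a minimal Python sketch of the same selection rule: count purchases per customer and item, then prefer the higher count and, on a tie, the lower priority number. The names `SALES`, `PRIORITY` and `most_likely_items` are made up for illustration; the rows are copied from the question with the dates dropped. It is only a way to confirm the expected result, not a replacement for the query.

```python
from collections import Counter

# (customer, item) pairs copied from the Sales table in the question (dates omitted).
SALES = [
    ("Alex", "Pen"), ("Rony", "Paper"), ("Alex", "Eraser"), ("Marty", "Eraser"),
    ("Alex", "Pen"), ("Rob", "Paper"), ("Alex", "Pencil"), ("Alex", "Pen"),
    ("Ned", "Pen"), ("Alex", "Paper"), ("Alex", "Pencil"), ("Ned", "Pen"),
    ("Alex", "Eraser"), ("Alex", "Pen"), ("Alex", "Pen"), ("Alex", "Paper"),
    ("Ned", "Paper"), ("Alex", "Pen"), ("Alex", "Eraser"), ("Alex", "Pen"),
    ("Rony", "Pencil"), ("Alex", "Eraser"), ("Ned", "Paper"), ("Alex", "Pen"),
]

# Priority table from the question: a lower number means a higher priority.
PRIORITY = {"Pen": 1, "Paper": 2, "Pencil": 3, "Eraser": 4}


def most_likely_items(sales, priority):
    """Return {customer: item}, choosing the most frequent item per customer.

    Ties on the count are broken by the lowest priority number. Items with no
    priority entry sort last, similar to NULLs under an ascending ORDER BY
    in PostgreSQL.
    """
    counts = Counter(sales)  # (customer, item) -> number of purchases
    best = {}
    for (customer, item), n in counts.items():
        # Smaller key wins: higher count first, then lower priority number.
        key = (-n, priority.get(item, float("inf")))
        if customer not in best or key < best[customer][0]:
            best[customer] = (key, item)
    return {customer: item for customer, (_, item) in best.items()}


if __name__ == "__main__":
    for customer, item in sorted(most_likely_items(SALES, PRIORITY).items()):
        print(customer, item)
```

For the sample data this gives Pen for Alex and Ned, Paper for Rob and Rony, and Eraser for Marty, which matches the list requested in the question.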

Answer 1: (score: 1)

select distinct on (customer)
    customer, itemsold, total
from
    (
        select customer, itemsold, count(*) total
        from sales
        group by customer, itemsold
    ) s
    inner join priority on itemsold = itemname
order by customer, total desc, priority
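
To see why this works, it may help to mimic the two steps by hand: `order by customer, total desc, priority` sorts the rows, and `distinct on (customer)` keeps only the first row of each customer group. The Python sketch below is an illustration under assumptions: `ROWS` and `distinct_on_customer` are hypothetical names, and the rows are the per-item totals for Ned and Rony worked out from the question's data.

```python
from itertools import groupby

# Hypothetical intermediate rows (customer, itemsold, total, priority): what the
# subquery joined to the priority table would produce for Ned and Rony.
ROWS = [
    ("Rony", "Pencil", 1, 3),
    ("Ned", "Paper", 2, 2),
    ("Rony", "Paper", 1, 2),
    ("Ned", "Pen", 2, 1),
]


def distinct_on_customer(rows):
    """Emulate DISTINCT ON (customer) ... ORDER BY customer, total DESC, priority."""
    # Step 1: order by customer, then total descending, then priority ascending.
    ordered = sorted(rows, key=lambda r: (r[0], -r[2], r[3]))
    # Step 2: keep only the first row of each customer group, dropping the
    # priority column so the shape is (customer, itemsold, total).
    return [next(group)[:3] for _, group in groupby(ordered, key=lambda r: r[0])]


if __name__ == "__main__":
    for row in distinct_on_customer(ROWS):
        print(row)
```

Both customers have tied totals, so the priority column decides the first row of each group: Pen for Ned and Paper for Rony.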

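Answer 2: (score: 0)

I think this is the fastest way. Note that I use priority in the order by without including it in the group by. PostgreSQL allows this when priority is functionally dependent on itemname in the Priority table, which in practice means itemname must be that table's primary key: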

select distinct on (s.customer)
    s.customer, p.itemname, count(*) as total
from sales as s
    inner join priority as p on p.itemname = s.itemsold
group by s.customer, p.itemname
order by s.customer, total desc, p.priority

If that is not possible, you can use this query, which adds priority to the group by:

select distinct on (s.customer)
    s.customer, s.itemsold, count(*) as total
from sales as s
    inner join priority as p on p.itemname = s.itemsold
group by s.customer, s.itemsold, p.priority
order by s.customer, total desc, p.priority;
