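How do I select only one row from the right table in an inner join, based on a sliding criterion?

Time: 2013-04-29 03:46:36

Tags: sql postgresql postgresql-9.2

I have a set of tables containing weeks, products, inventory, and weekly forecasts. For week X, I want to select each product's inventory and its latest forecast. But I can't wrap my head around the SQL: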

create table products (
    product_id integer
);
create table inventory (
    product_id integer,
    asof_week integer,
    qoh float8
);
create table forecast (
    product_id integer,
    for_week integer,
    asof_week integer,
    projection float8
);
create table weeks (
    wkno integer
);
insert into weeks values (4),(5),(6),(7);
insert into products values(1),(2);
insert into inventory values(1,5,10),(1,6,20),(2,6,200);
insert into forecast values(1,4,1,10),(1,4,2,11),(1,4,3,12),(1,4,4,13),
                           (1,5,1,11),(1,5,2,11),(1,5,3,21),(1,5,4,31),
--corr:one too many        (1,6,1,10),(1,6,2,11),(1,6,3,12),(1,6,4,22),(1,6,5,32),(1,6,5,42),(1,6,6,42),
                           (1,6,1,10),(1,6,2,11),(1,6,3,12),(1,6,4,22),(1,6,5,42),(1,6,6,42),
                           (1,7,1,10),(1,7,6,16),
                           (2,6,5,2000),(2,7,5,2100),(2,8,5,30);

And a query:

select p.product_id "product",
        i.asof_week "inven asof",
        i.qoh "qoh",
        f.for_week "fcast for",
        f.projection "fcast qty",
        f.asof_week "fcast asof"
from weeks w, products p
    left join inventory i on(p.product_id = i.product_id)
    left join forecast f on(p.product_id = f.product_id)
where
    (i.asof_week is null or i.asof_week = w.wkno)
    and (f.for_week is null or f.for_week = w.wkno)
    and (f.asof_week is null
        or f.asof_week = (select max(f2.asof_week)
                          from forecast f2
                          where f2.product_id = f.product_id
                             and f2.for_week = f.for_week))
order by p.product_id, i.asof_week, f.for_week, f.asof_week

For example, for weeks 4-7, I'm looking for this result set:

product week    qoh     projection
1       4       -       13
1       5       10      31
1       6       20      42
1       7       -       16
2       6       200     2000
2       7       -       2100

But what I actually get is only 3 rows:

 product | inven asof | qoh | fcast for | fcast qty | fcast asof 
---------+------------+-----+-----------+-----------+------------
       1 |          5 |  10 |         5 |        31 |          4
       1 |          6 |  20 |         6 |        42 |          6
       2 |          6 | 200 |         6 |      2000 |          5
(3 rows)
Time: 2.531 ms

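I'm new to SQL and could use some helpful pointers.

Some notes about the data: I have several other data tables to join, which I left out of the example to focus on this problem. At least one of them is similar in nature to the forecast quantity table (i.e. multiple version rows per product x week). There are about 100 forecast rows per product x week, so at some point I will also have to worry about efficiency... but first, getting correct results.

I'm on postgresql 9.2.

Thanks.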

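3 Answers:

Answer 0: (score: 2)

It's hard to give general pointers without knowing the rest of your data model, but I will say this: I generally find queries easier to reason about when I keep them "flat". Also, as soon as I have a bunch of null checks, I try to either add guarantees to my data or reorient my query around a different "root" table.

Anyway, the following should work for you (though I can't guarantee it works for any data, particularly in the presence of duplicates):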

select
  products.product_id,
  weeks.wkno,
  inventory.qoh,
  max(projection)
from forecast
join products on products.product_id = forecast.product_id
join weeks on weeks.wkno = forecast.for_week
left join inventory on
  inventory.product_id = products.product_id
  and inventory.asof_week = weeks.wkno
group by
  products.product_id,
  weeks.wkno,
  inventory.qoh

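Sorry I can't give you more advice than that. I hope this helps.

Edit: Adjusted the query to remove the cross join. The original version is here. If you want to left join forecast, in case some forecasts are missing, you may need the cross join. For your specific example, it isn't needed.

Edit 2: The query above is semantically incorrect. The following is correct, but doesn't illustrate my point.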

select
  p.product_id,
  p.wkno,
  p.qoh,
  f.projection
from
  (select
      products.product_id,
      weeks.wkno,
      inventory.qoh,
      max(forecast.asof_week) max_p
    from forecast
    join products on products.product_id = forecast.product_id
    join weeks on weeks.wkno = forecast.for_week
    left join inventory on
      inventory.product_id = products.product_id
      and inventory.asof_week = weeks.wkno
    group by
      products.product_id,
      weeks.wkno,
      inventory.qoh) as p
  join forecast f on
    f.product_id = p.product_id
    and f.for_week = p.wkno
    and f.asof_week = p.max_p

Answer 1: (score: 1)

The data appears to be missing some PK/FK constraints:

CREATE TABLE products (
    product_id INTEGER PRIMARY KEY
    );
CREATE TABLE weeks (
    wkno INTEGER PRIMARY KEY
    );
CREATE TABLE inventory (
    product_id INTEGER REFERENCES products(product_id)
    , asof_week INTEGER REFERENCES weeks(wkno)
    , qoh float8
    , PRIMARY KEY (product_id,asof_week)
    );
CREATE TABLE forecast (
    product_id INTEGER REFERENCES products(product_id)
    , for_week INTEGER REFERENCES weeks(wkno)
    , asof_week INTEGER REFERENCES weeks(wkno)
    , projection FLOAT8
    , PRIMARY KEY (product_id,for_week,asof_week)
    );
INSERT INTO weeks VALUES (4),(5),(6),(7)
    , (1),(2),(3), (8) -- need these, too
    ;
-- et cetera.

If the weeks table functions as a "calendar" table, it could (and should) be replaced by a generate_series(4,7) pseudo-table. (And the FK constraints dropped.)
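
A minimal sketch of that replacement, assuming the products and inventory tables from the question. The alias w(wkno) is only chosen here so that the rest of a query can keep referring to w.wkno; the range 4..7 is taken from the example.

-- generate_series(4, 7) stands in for the weeks table (assumption: weeks are consecutive integers)
SELECT p.product_id
        , w.wkno
        , i.qoh
FROM products p
CROSS JOIN generate_series(4, 7) AS w(wkno)
LEFT JOIN inventory i ON i.product_id = p.product_id AND i.asof_week = w.wkno
ORDER BY p.product_id, w.wkno
    ;

This yields one row per product and week, with qoh NULL wherever no inventory row exists for that week.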

The query suffers a lot from the LEFT JOIN + MAX (aggregate) construct. The following should do the same and looks simpler (NOT EXISTS to the rescue ...):

SELECT p.product_id "product"
        , i.asof_week "inven asof"
        , i.qoh "qoh"
        , f.for_week "fcast for"
        , f.projection "fcast qty"
        , f.asof_week "fcast asof"
FROM products p
CROSS JOIN weeks w
LEFT JOIN inventory i ON i.product_id = p.product_id AND i.asof_week = w.wkno
LEFT JOIN forecast f ON f.product_id = p.product_id AND f.for_week = w.wkno
WHERE NOT EXISTS (
    SELECT * FROM forecast f2
    WHERE f2.product_id = f.product_id
      AND f2.for_week = f.for_week
      AND f2.asof_week > f.asof_week -- no newer forecast exists, so f is the latest one
    )
AND COALESCE(i.asof_week,f.for_week) IS NOT NULL
ORDER BY p.product_id, i.asof_week, f.for_week, f.asof_week
    ;

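Answer 2: (score: 0)

Thanks to Julien for the hint. This gets the result, though I'm not sure it's the best way, or how it will perform once I have 100+ million rows, since I'm still working with a toy data set. Probably the first bad thing is that pw below has no index.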

with pw as ( select * from products, weeks )         
    select pw.product_id "product",
            pw.wkno,         
            i.asof_week "inven asof",    
            coalesce(i.qoh::text,'missing') "qoh",   
            f.for_week "fcast for",       
            coalesce(f.projection::text,'no fcast') "fcast qty",      
            f.asof_week "fcast asof"     
    from pw      
        left join inventory i on(pw.product_id = i.product_id and pw.wkno = i.asof_week )
        left join forecast f on(pw.product_id = f.product_id  
                                and f.for_week = pw.wkno          
                                and f.asof_week = (select max(f2.asof_week)               
                                                from forecast f2                          
                                                where f2.product_id = pw.product_id                       
                                                    and f2.asof_week < pw.wkno                            
                                                    and f2.for_week = pw.wkno))                               
    where        
        not (i.asof_week is null and f.asof_week is null)     
    order by pw.product_id,  
                pw.wkno,     
                f.for_week,                          
                f.asof_week           

which produces

 product | wkno | inven asof |   qoh   | fcast for | fcast qty | fcast asof
---------+------+------------+---------+-----------+-----------+------------
       1 |    4 |            | missing |         4 | 12        |          3
       1 |    5 |          5 | 10      |         5 | 31        |          4
       1 |    6 |          6 | 20      |         6 | 42        |          5
       1 |    7 |            | missing |         7 | 16        |          6
       2 |    6 |          6 | 200     |         6 | 2000      |          5
       2 |    7 |            | missing |         7 | 2100      |          5
(6 rows)
Time: 2.999 ms
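
On the efficiency worry, one possible alternative (a sketch, not a measured improvement) is PostgreSQL's DISTINCT ON, which picks the newest forecast per product and week in a single pass instead of running a correlated max() subquery. The index name forecast_latest_idx and the CTE name latest are made up for this example, and the filter asof_week < for_week is assumed to match the "forecast made before the week" rule used in the query above. Whether the planner actually uses the index depends on the real data, so it is worth checking with EXPLAIN ANALYZE.

-- Hypothetical index supporting "latest forecast per product and week" lookups
create index forecast_latest_idx
    on forecast (product_id, for_week, asof_week desc);

with latest as (
    -- DISTINCT ON keeps the first row of each (product_id, for_week) group,
    -- which is the newest asof_week because of the ORDER BY below
    select distinct on (product_id, for_week)
           product_id, for_week, asof_week, projection
    from forecast
    where asof_week < for_week
    order by product_id, for_week, asof_week desc
)
select p.product_id "product",
        w.wkno,
        i.qoh "qoh",
        l.projection "fcast qty",
        l.asof_week "fcast asof"
from products p
    cross join weeks w
    left join inventory i on(i.product_id = p.product_id and i.asof_week = w.wkno)
    left join latest l on(l.product_id = p.product_id and l.for_week = w.wkno)
where
    not (i.asof_week is null and l.asof_week is null)
order by p.product_id, w.wkno;

On the sample data this should return the same six product/week combinations as the output above, with NULL in place of the 'missing' and 'no fcast' placeholders.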