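I have a set of tables containing weeks, products, inventory, and weekly forecasts, and I want to select each product's inventory and latest forecast for week X. But I can't get my head around the SQL: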
create table products (
product_id integer
);
create table inventory (
product_id integer,
asof_week integer,
qoh float8
);
create table forecast (
product_id integer,
for_week integer,
asof_week integer,
projection float8
);
create table weeks (
wkno integer
);
insert into weeks values (4),(5),(6),(7);
insert into products values(1),(2);
insert into inventory values(1,5,10),(1,6,20),(2,6,200);
insert into forecast values(1,4,1,10),(1,4,2,11),(1,4,3,12),(1,4,4,13),
(1,5,1,11),(1,5,2,11),(1,5,3,21),(1,5,4,31),
--corr:one too many (1,6,1,10),(1,6,2,11),(1,6,3,12),(1,6,4,22),(1,6,5,32),(1,6,5,42),(1,6,6,42),
(1,6,1,10),(1,6,2,11),(1,6,3,12),(1,6,4,22),(1,6,5,42),(1,6,6,42),
(1,7,1,10),(1,7,6,16),
(2,6,5,2000),(2,7,5,2100),(2,8,5,30);
The query:
select p.product_id "product",
i.asof_week "inven asof",
i.qoh "qoh",
f.for_week "fcast for",
f.projection "fcast qty",
f.asof_week "fcast asof"
from weeks w, products p
left join inventory i on(p.product_id = i.product_id)
left join forecast f on(p.product_id = f.product_id)
where
(i.asof_week is null or i.asof_week = w.wkno)
and (f.for_week is null or f.for_week = w.wkno)
and (f.asof_week is null
or f.asof_week = (select max(f2.asof_week)
from forecast f2
where f2.product_id = f.product_id
and f2.for_week = f.for_week))
order by p.product_id, i.asof_week, f.for_week, f.asof_week
For example, for weeks 4-7, I'm looking for this result set:
product  week  qoh  projection
1        4     -    13
1        5     10   31
1        6     20   42
1        7     -    16
2        6     200  2000
2        7     -    2100
But what I actually get is only 3 rows:
product | inven asof | qoh | fcast for | fcast qty | fcast asof
---------+------------+-----+-----------+-----------+------------
1 | 5 | 10 | 5 | 31 | 4
1 | 6 | 20 | 6 | 42 | 6
2 | 6 | 200 | 6 | 2000 | 5
(3 rows)
Time: 2.531 ms
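I'm new to SQL and could use some helpful pointers.
Some notes about the data: I have several other data tables to join, which I've left out of the example to focus on this problem; at least one of them is similar to the forecast quantity table (i.e. multiple version rows per product x week). There are about 100 forecast rows per product per week X, so somewhere down the line I'll also have to worry about efficiency... but first I need to get the right results.
I'm on PostgreSQL 9.2.
Thanks.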
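Answer 0 (score: 2)
It's hard to give general pointers without knowing the rest of your data model, but I'll say this: I generally find queries easier to reason about when I keep them "flat". Also, whenever I have a bunch of null checks, I try to either add guarantees to my data or re-orient my query around a different "root" table.
Anyway, the following should work for you (though I can't guarantee it works for any data, particularly in the presence of duplicates):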
select
products.product_id,
weeks.wkno,
inventory.qoh,
max(projection)
from forecast
join products on products.product_id = forecast.product_id
join weeks on weeks.wkno = forecast.for_week
left join inventory on
inventory.product_id = products.product_id
and inventory.asof_week = weeks.wkno
group by
products.product_id,
weeks.wkno,
inventory.qoh
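Sorry I can't give you more advice than that. I hope this helps.
Edit: adjusted the query to remove the cross join. Original version here. If you want to left join forecast in case some forecasts are missing, you may need the cross join. For your specific example it isn't needed.
Edit 2: the query above is semantically incorrect: max(projection) returns the largest projection, not the most recent one, and the two only coincide on the sample data because the projections there never decrease from one asof_week to the next. The following is correct, but doesn't illustrate my point.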
select
p.product_id,
p.wkno,
p.qoh,
f.projection
from
(select
products.product_id,
weeks.wkno,
inventory.qoh,
max(forecast.asof_week) max_p
from forecast
join products on products.product_id = forecast.product_id
join weeks on weeks.wkno = forecast.for_week
left join inventory on
inventory.product_id = products.product_id
and inventory.asof_week = weeks.wkno
group by
products.product_id,
weeks.wkno,
inventory.qoh) as p
join forecast f on
f.product_id = p.product_id
and f.for_week = p.wkno
and f.asof_week = p.max_p
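The "latest row per group" step in the subquery above can also be written with PostgreSQL's DISTINCT ON, which is available on 9.2. This is a minimal sketch, assuming the tables and sample data from the question; the alias lf is made up for illustration. Like the query above, it starts from forecast, so a week that has inventory but no forecast would not appear.

```sql
-- Sketch: latest forecast per (product_id, for_week) via DISTINCT ON.
-- Assumes the forecast, inventory and weeks tables from the question.
select
    lf.product_id,
    lf.for_week as wkno,
    i.qoh,
    lf.projection
from (
    -- DISTINCT ON keeps the first row of each (product_id, for_week) group;
    -- ordering by asof_week desc makes that first row the newest forecast.
    select distinct on (product_id, for_week)
           product_id, for_week, asof_week, projection
    from forecast
    order by product_id, for_week, asof_week desc
) as lf
join weeks w on w.wkno = lf.for_week          -- restrict to the weeks of interest
left join inventory i
       on i.product_id = lf.product_id
      and i.asof_week  = lf.for_week          -- qoh is NULL when no inventory row matches
order by lf.product_id, lf.for_week;
```

On the sample data this should give the six rows the question asks for, with NULL in qoh where the question shows "-".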
Answer 1 (score: 1)
There appear to be some PK/FK constraints missing from the data model:
CREATE TABLE products (
product_id INTEGER PRIMARY KEY
);
CREATE TABLE weeks (
wkno INTEGER PRIMARY KEY
);
CREATE TABLE inventory (
product_id INTEGER REFERENCES products(product_id)
, asof_week INTEGER REFERENCES weeks(wkno)
, qoh float8
, PRIMARY KEY (product_id,asof_week)
);
CREATE TABLE forecast (
product_id INTEGER REFERENCES products(product_id)
, for_week INTEGER REFERENCES weeks(wkno)
, asof_week INTEGER REFERENCES weeks(wkno)
, projection FLOAT8
, PRIMARY KEY (product_id,for_week,asof_week)
);
INSERT INTO weeks VALUES (4),(5),(6),(7)
, (1),(2),(3), (8) -- need these, too
;
-- et cetera.
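If the weeks table is only used as a "calendar" table, it can (and should) be replaced by the generate_series(4,7) pseudo-table. (And the FK constraints that reference it would then be dropped.)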
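As a rough illustration of that substitution, the sketch below builds the product x week grid from generate_series instead of the weeks table. It assumes the products table from the question; the alias w(wkno) is chosen only so the generated column reads like weeks.wkno.

```sql
-- Sketch: generate_series(4, 7) standing in for the weeks table.
-- The column alias w(wkno) names the generated integer column.
SELECT p.product_id
     , w.wkno
FROM products p
CROSS JOIN generate_series(4, 7) AS w(wkno)
ORDER BY p.product_id, w.wkno;
```

With the two sample products this yields eight rows, one per product and week, which can then be left joined to inventory and forecast in the same way as the weeks table.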
The query suffers badly from the LEFT JOIN + MAX(aggregate) construct. The following should do the same thing and looks simpler (NOT EXISTS to the rescue...):
SELECT p.product_id "product"
, i.asof_week "inven asof"
, i.qoh "qoh"
, f.for_week "fcast for"
, f.projection "fcast qty"
, f.asof_week "fcast asof"
FROM products p
CROSS JOIN weeks w
LEFT JOIN inventory i ON i.product_id = p.product_id AND i.asof_week = w.wkno
LEFT JOIN forecast f ON f.product_id = p.product_id AND f.for_week = w.wkno
WHERE NOT EXISTS (
SELECT * FROM forecast f2
WHERE f2.product_id = f.product_id
AND f2.for_week = f.for_week
AND f2.asof_week > f.asof_week -- a newer forecast exists, so f is not the latest
)
AND COALESCE(i.asof_week,f.for_week) IS NOT NULL
ORDER BY p.product_id, i.asof_week, f.for_week, f.asof_week
;
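The query above puts the week conditions in the ON clauses rather than in WHERE, and that appears to be what brings back the rows missing from the question's result. A small sketch of the difference, assuming the sample data from the question and looking at inventory only:

```sql
-- Condition in WHERE: product 1 has inventory rows (weeks 5 and 6), so
-- i.asof_week is never NULL for it, and no row survives for weeks 4 or 7.
SELECT p.product_id, w.wkno, i.qoh
FROM weeks w, products p
LEFT JOIN inventory i ON i.product_id = p.product_id
WHERE (i.asof_week IS NULL OR i.asof_week = w.wkno)
ORDER BY p.product_id, w.wkno;

-- Condition in ON: every (product, week) pair is kept, and qoh is NULL
-- wherever no inventory row matches that week.
SELECT p.product_id, w.wkno, i.qoh
FROM products p
CROSS JOIN weeks w
LEFT JOIN inventory i ON i.product_id = p.product_id
                     AND i.asof_week = w.wkno
ORDER BY p.product_id, w.wkno;
```

On the sample data the first form returns 3 rows and the second returns 8. The "IS NULL OR ..." test in WHERE only helps when a product has no inventory rows at all, not when it has rows for other weeks.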
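Answer 2 (score: 0)
Thanks to Julien for the hints. This gets the results, though I'm not sure it's the best approach, or how it will perform once I have more than a million rows, since I'm still working with a toy data set. Probably the first bad thing is that pw below has no index.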
with pw as ( select * from products, weeks )
select pw.product_id "product",
pw.wkno,
i.asof_week "inven asof",
coalesce(i.qoh::text,'missing') "qoh",
f.for_week "fcast for",
coalesce(f.projection::text,'no fcast') "fcast qty",
f.asof_week "fcast asof"
from pw
left join inventory i on(pw.product_id = i.product_id and pw.wkno = i.asof_week )
left join forecast f on(pw.product_id = f.product_id
and f.for_week = pw.wkno
and f.asof_week = (select max(f2.asof_week)
from forecast f2
where f2.product_id = pw.product_id
and f2.asof_week < pw.wkno
and f2.for_week = pw.wkno))
where
not (i.asof_week is null and f.asof_week is null)
order by pw.product_id,
pw.wkno,
f.for_week,
f.asof_week
which produces:
product | wkno | inven asof | qoh | fcast for | fcast qty | fcast asof
---------+------+------------+---------+-----------+-----------+------------
1 | 4 | | missing | 4 | 12 | 3
1 | 5 | 5 | 10 | 5 | 31 | 4
1 | 6 | 6 | 20 | 6 | 42 | 5
1 | 7 | | missing | 7 | 16 | 6
2 | 6 | 6 | 200 | 6 | 2000 | 5
2 | 7 | | missing | 7 | 2100 | 5
(6 rows)
Time: 2.999 ms
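On the performance worry, one thing that might be worth trying before the data grows is an index that matches the correlated subquery's lookup. This is only a sketch under assumptions: the index name is made up, and whether the planner uses the index depends on the real data volume, so the plan should be checked with EXPLAIN ANALYZE rather than taken on trust.

```sql
-- Hypothetical index name. The column order mirrors the subquery's lookup:
-- equality on product_id and for_week, then the newest asof_week first.
create index forecast_product_week_asof_idx
    on forecast (product_id, for_week, asof_week desc);

-- Inspect the plan and timing for one instance of the "latest forecast made
-- before this week" lookup used in the query above.
explain analyze
select max(f2.asof_week)
from forecast f2
where f2.product_id = 1
  and f2.for_week = 6
  and f2.asof_week < 6;
```

If forecast already carries the primary key on (product_id, for_week, asof_week) suggested in the earlier answer, that key's index covers the same columns and this extra index may be unnecessary.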