Average for each product in PostgreSQL

Asked: 2016-08-25 05:41:55

Tags: sql postgresql

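I'm new to SQL, and I'm sure this must be a common problem, but I couldn't find a solution. If you could point me in the right direction, that would be great.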

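I have a table called forecasts that holds all forecasts for products. Products live in a separate table that lists every unique product, with a number as the identifier.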

I am trying to calculate the average forecast for every product, based on the forecasts table.

Forecasts table

CREATE TABLE forecasts
(
  id INTEGER PRIMARY KEY NOT NULL,
  month DATE,
  quantity INTEGER,
  extract_date DATE,
  product_number VARCHAR,
  final BOOLEAN DEFAULT false
);

I am currently using the following query and iterating over each item in Ruby on Rails to generate the average forecast.

Average query for a single product

WITH three_month_forecast AS (
  SELECT product_number, month, sum(quantity) as forecast
  FROM forecasts
  WHERE extract_date >= '2016-08-01'::DATE - INTERVAL '1 month'
    AND extract_date < '2016-08-01'::DATE
    AND final = TRUE
    AND month >= '2016-08-01'::DATE
    AND month < '2016-08-01'::DATE + INTERVAL '3 months'
    AND product_number = '100046119'
  GROUP BY product_number, month, extract_date
  ORDER BY month
)

SELECT avg(forecast) FROM three_month_forecast

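There are about 100k items in the products database, so this takes a while to complete in Rails. It should be much faster in SQL, without having to iterate over each item separately.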

Any idea how I can run the average query for all items in the products database, so that it returns a table like this:

product_number | average_forecast

Any help is appreciated. Thanks.

Edit

A sqlfiddle with a sample calculation for one product: http://sqlfiddle.com/#!15/ab1637/2

Edit 2

Added the query plan (EXPLAIN ANALYZE output). The table currently has no indexes.

Aggregate  (cost=15074.16..15074.17 rows=1 width=8) (actual time=432.189..432.190 rows=1 loops=1)
      CTE three_month_forecast
        ->  Sort  (cost=15074.13..15074.14 rows=1 width=22) (actual time=431.935..431.935 rows=3 loops=1)
              Sort Key: forecasts.month
              Sort Method: quicksort  Memory: 25kB
              ->  HashAggregate  (cost=15074.11..15074.12 rows=1 width=22) (actual time=431.354..431.363 rows=3 loops=1)
                    ->  Seq Scan on forecasts  (cost=0.00..15074.08 rows=3 width=22) (actual time=0.765..431.255 rows=3 loops=1)
                          Filter: (final AND (extract_date >= '2016-07-01 00:00:00'::timestamp without time zone) AND (extract_date < '2016-08-01'::date) AND (month >= '2016-08-01'::date) AND (month < '2016-11-01 00:00:00'::timestamp without time zone) AND ((product_number)::text = '100046119'::text))
                          Rows Removed by Filter: 442623
      ->  CTE Scan on three_month_forecast  (cost=0.00..0.02 rows=1 width=8) (actual time=431.959..431.962 rows=3 loops=1)
    Total runtime: 432.513 ms
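
The plan above spends nearly all of its time in a sequential scan that discards 442,623 rows to keep 3. One possible way to reduce that, sketched here as an assumption rather than a tested fix, is a partial index covering the filtered columns. The index name is hypothetical, and whether the planner actually uses the index depends on the data distribution.

```sql
-- Hypothetical index name; column order is an assumption, not a measured optimum.
-- The partial predicate "WHERE final" matches the query's "final = TRUE" filter,
-- so only finalized forecast rows are indexed.
CREATE INDEX forecasts_final_product_extract_month_idx
    ON forecasts (product_number, extract_date, month)
    WHERE final;

-- Refresh planner statistics, then re-check the plan for the single-product query.
ANALYZE forecasts;

EXPLAIN ANALYZE
SELECT product_number, month, sum(quantity) AS forecast
FROM forecasts
WHERE final = TRUE
  AND extract_date >= '2016-08-01'::DATE - INTERVAL '1 month'
  AND extract_date <  '2016-08-01'::DATE
  AND month >= '2016-08-01'::DATE
  AND month <  '2016-08-01'::DATE + INTERVAL '3 months'
  AND product_number = '100046119'
GROUP BY product_number, month, extract_date;
```

Leading with product_number suits the one-product-at-a-time lookup. A query that aggregates every product in one pass has no such equality filter, so it may still be served best by a sequential scan.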

Final edit

@QuoVadis's solution works perfectly. Once I saw it, it was so obvious. Thanks.

1 answer:

Answer 0 (score: 0)

This should do it.

WITH three_month_forecast AS (
  SELECT product_number, month, sum(quantity) as forecast
  FROM forecasts
  WHERE extract_date >= '2016-08-01'::DATE - INTERVAL '1 month'
    AND extract_date < '2016-08-01'::DATE
    AND final = TRUE
    AND month >= '2016-08-01'::DATE
    AND month < '2016-08-01'::DATE + INTERVAL '3 months'
    AND product_number in (select distinct product_number from forecasts)
  GROUP BY product_number, month, extract_date
  ORDER BY month
)
SELECT avg(forecast), product_number FROM three_month_forecast
GROUP BY product_number

See the example here.
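
The key change from the single-product query is dropping the per-product filter and grouping the outer SELECT by product_number. As a hedged sketch of a slightly tighter variant: the `IN (SELECT DISTINCT product_number FROM forecasts)` condition matches every non-NULL product_number already in the table, so it can likely be dropped, and the ORDER BY inside the CTE has no effect on the averages. The `average_forecast` alias is taken from the output format requested in the question.

```sql
WITH three_month_forecast AS (
  -- One row per (product, forecast month, extract date) with the summed quantity.
  SELECT product_number, month, sum(quantity) AS forecast
  FROM forecasts
  WHERE final = TRUE
    AND extract_date >= '2016-08-01'::DATE - INTERVAL '1 month'
    AND extract_date <  '2016-08-01'::DATE
    AND month >= '2016-08-01'::DATE
    AND month <  '2016-08-01'::DATE + INTERVAL '3 months'
  GROUP BY product_number, month, extract_date
)
-- Average the summed forecasts per product.
SELECT product_number,
       avg(forecast) AS average_forecast
FROM three_month_forecast
GROUP BY product_number
ORDER BY product_number;
```

Two caveats apply to this version and to the original answer. Products with no qualifying rows in forecasts do not appear in the result; listing them with a NULL average would require a LEFT JOIN from the products table, whose schema is not shown in the question. Also, because the CTE groups by extract_date as well as month, a month with several extract dates inside the one-month window contributes several rows to the average.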