Question

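I have a complex (for me) SQL query in PostgreSQL 9.2.4 that uses generate_series and multiple joins. I need to sum the reps of all exercises on a given date in the exercises table, making sure those exercises belong to workouts done by the current user. Finally, I need to join that result to a series (using generate_series) so that missing dates are shown as well.

My idea is to select the series in the FROM clause and then left join it to a subquery holding the result of an inner join between the exercises and workouts tables. For example, I have the following query: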

SELECT 
    DISTINCT date_trunc('day', series.date)::date as date,
    sum(COALESCE(reps, 0)) OVER WIN,
    array_agg(workout_id) OVER WIN as ids     
FROM (
    select generate_series(-22, 0) + current_date as date
) series 
LEFT JOIN (
    exercises INNER JOIN workouts 
    ON exercises.workout_id = workouts.id
) 
ON series.date = exercises.created_at::date 
WINDOW 
   WIN AS (PARTITION BY date_trunc('day', series.date)::date)
ORDER BY date ASC;

This gives the following output:

    date    | sum |                           ids                           
------------+-----+---------------------------------------------------------
 2013-04-27 |   0 | {NULL}
 2013-04-28 | 432 | {49,48,47,46,45,44,43,42,41,38,37,36,36,36,36,35,34,33}
 2013-04-29 |   0 | {NULL}
 2013-04-30 |  20 | {50}
 2013-05-01 |   0 | {NULL}
 2013-05-02 |   0 | {NULL}
 2013-05-03 |   0 | {NULL}
 2013-05-04 |   0 | {NULL}
 2013-05-05 |   0 | {NULL}
 2013-05-06 |   0 | {NULL}
 2013-05-07 |  40 | {51,51}
 2013-05-08 |   0 | {NULL}
 2013-05-09 |   0 | {NULL}
 2013-05-10 |   0 | {NULL}
 2013-05-11 |   0 | {NULL}
 2013-05-12 |   0 | {NULL}
 2013-05-13 |   0 | {NULL}
 2013-05-14 |   0 | {NULL}
 2013-05-15 |   0 | {NULL}
 2013-05-16 |  20 | {52}
 2013-05-17 |   0 | {NULL}
 2013-05-18 |   0 | {NULL}
 2013-05-19 |   0 | {NULL}
(23 rows)

However, I want to filter by some condition:

WHERE workouts.user_id = 5

for example.

But if I add a WHERE clause with that condition to the query above, the output is as follows:

    date    | sum |                           ids                           
------------+-----+---------------------------------------------------------
 2013-04-28 | 432 | {49,48,47,46,45,44,43,42,41,38,37,36,36,36,36,35,34,33}
 2013-04-30 |  20 | {50}
 2013-05-07 |  40 | {51,51}
 2013-05-16 |  20 | {52}
(4 rows)

The series is gone.

How can I filter by user_id and keep the series? Any help would be greatly appreciated.

Answer 1

I have a complex (for me) SQL query...

Indeed, you do. But it doesn't have to be that way:

SELECT s.day
      ,COALESCE(sum(w.reps), 0) AS sum_reps  -- assuming reps comes from workouts
      ,array_agg(e.workout_id)  AS ids
FROM   exercises e
JOIN   workouts  w ON w.id = e.workout_id AND w.user_id = 5
RIGHT  JOIN (
   SELECT now()::date + generate_series(-22, 0) AS day
   ) s ON s.day = e.created_at::date 
GROUP  BY 1
ORDER  BY 1;

Major points:

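RIGHT [OUTER] JOIN is the reverse twin of LEFT JOIN. Since joins are applied left to right, you don't need parentheses this way.
Never use the basic type and function name date as an identifier. I substituted day.
Update: To avoid NULL in the result of the aggregate / window function sum(), use an outer COALESCE as demonstrated, COALESCE(sum(reps), 0), and not:
```
sum(COALESCE(reps, 0))
```
You don't need date_trunc() at all. s.day is a date to begin with, so this is redundant:
```
date_trunc('day', s.day)::date AS day
```
In this case a simple GROUP BY does the job; you don't need the complex and comparatively expensive combination of DISTINCT + window functions.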
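
The user_id filter sits in the join condition rather than in a WHERE clause for a reason: WHERE is applied after the outer join, so days without a match (where user_id is NULL) get discarded. A minimal sketch of the difference, using made-up inline data (the CTE names days and w and all values are hypothetical, not taken from the question's tables):

```sql
WITH days(day) AS (
   VALUES (date '2013-05-01'), (date '2013-05-02'), (date '2013-05-03')
   )
, w(day, user_id, reps) AS (          -- stand-in for exercises joined to workouts
   VALUES (date '2013-05-02', 5, 20)
         ,(date '2013-05-03', 7, 10)
   )
-- Filter in WHERE: unmatched days carry user_id NULL and are removed
SELECT 'where' AS variant, d.day, w.reps
FROM   days d
LEFT   JOIN w ON w.day = d.day
WHERE  w.user_id = 5

UNION ALL
-- Filter in ON: every day survives, unmatched days get reps NULL
SELECT 'on', d.day, w.reps
FROM   days d
LEFT   JOIN w ON w.day = d.day AND w.user_id = 5
ORDER  BY 1, 2;
```

The 'on' variant should return all three days, while the 'where' variant keeps only 2013-05-02, which mirrors the disappearing series in the question.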

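Aggregate functions and `COALESCE()`

There has been confusion about this in a couple of questions recently.

Generally, sum() and other aggregate functions ignore NULL values. The result is the same as if the value did not exist at all. There are special cases, however. The manual advises:

It should be noted that except for count, these functions return a null value when no rows are selected. In particular, sum of no rows returns null, not zero as one might expect, and array_agg returns null rather than an empty array when there are no input rows. The coalesce function can be used to substitute zero or an empty array for null when necessary.

This demo should clarify matters by demonstrating the corner cases:

1 table with no rows.
3 tables with 1 row holding (NULL / 0 / 1).
3 tables with 2 rows holding NULL and (NULL / 0 / 1).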

Test setup

-- no rows
CREATE TABLE t_empty (i int);
-- INSERT nothing

CREATE TABLE t_0 (i int);
CREATE TABLE t_1 (i int);
CREATE TABLE t_n (i int);

-- 1 row
INSERT INTO t_0 VALUES (0);
INSERT INTO t_1 VALUES (1);
INSERT INTO t_n VALUES (NULL);

CREATE TABLE t_0n (i int);
CREATE TABLE t_1n (i int);
CREATE TABLE t_nn (i int);

-- 2 rows
INSERT INTO t_0n VALUES (0),    (NULL);
INSERT INTO t_1n VALUES (1),    (NULL);
INSERT INTO t_nn VALUES (NULL), (NULL);

Query

SELECT 't_empty'           AS tbl
      ,count(*)            AS ct_all
      ,count(i)            AS ct_i
      ,sum(i)              AS simple_sum
      ,sum(COALESCE(i, 0)) AS inner_coalesce
      ,COALESCE(sum(i), 0) AS outer_coalesce
FROM   t_empty

UNION ALL
SELECT 't_0',  count(*), count(i)
      ,sum(i), sum(COALESCE(i, 0)), COALESCE(sum(i), 0) FROM t_0
UNION ALL
SELECT 't_1',  count(*), count(i)
      ,sum(i), sum(COALESCE(i, 0)), COALESCE(sum(i), 0) FROM t_1
UNION ALL
SELECT 't_n',  count(*), count(i)
      ,sum(i), sum(COALESCE(i, 0)), COALESCE(sum(i), 0) FROM t_n

UNION ALL
SELECT 't_0n', count(*), count(i)
      ,sum(i), sum(COALESCE(i, 0)), COALESCE(sum(i), 0) FROM t_0n
UNION ALL
SELECT 't_1n', count(*), count(i)
      ,sum(i), sum(COALESCE(i, 0)), COALESCE(sum(i), 0) FROM t_1n
UNION ALL
SELECT 't_nn', count(*), count(i)
      ,sum(i), sum(COALESCE(i, 0)), COALESCE(sum(i), 0) FROM t_nn;

Result

   tbl   | ct_all | ct_i | simple_sum | inner_coalesce | outer_coalesce
---------+--------+------+------------+----------------+----------------
 t_empty |      0 |    0 |     <NULL> |         <NULL> |              0
 t_0     |      1 |    1 |          0 |              0 |              0
 t_1     |      1 |    1 |          1 |              1 |              1
 t_n     |      1 |    0 |     <NULL> |              0 |              0
 t_0n    |      2 |    1 |          0 |              0 |              0
 t_1n    |      2 |    1 |          1 |              1 |              1
 t_nn    |      2 |    0 |     <NULL> |              0 |              0

-> SQLfiddle

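Ergo, my initial advice was sloppy. You may need COALESCE with sum(). But if you do, use an outer COALESCE. The inner COALESCE in the original query doesn't cover all corner cases and is rarely useful.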
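
The manual passage quoted above also mentions array_agg. A small sketch of the same outer COALESCE pattern applied to it, using an inline subquery that yields zero rows (the alias sub and the output column names are placeholders):

```sql
SELECT array_agg(i)                 AS plain_agg       -- NULL, because there are no input rows
      ,COALESCE(array_agg(i), '{}') AS outer_coalesce  -- empty array instead of NULL
FROM  (SELECT 1 AS i WHERE false) sub;
```

Note that this does not change the {NULL} entries in the question's output. Those arrays are built from a real row with a NULL workout_id produced by the outer join, not from an empty input, and array_agg keeps NULL elements.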

Answer 2

Instead of fetching all rows from the workouts table, you can put the condition right there:

SELECT 
    DISTINCT date_trunc('day', series.date)::date as date,
    sum(COALESCE(reps, 0)) OVER WIN,
    array_agg(workout_id) OVER WIN as ids     
FROM (
    select generate_series(-22, 0) + current_date as date
) series 
LEFT JOIN (
    exercises INNER JOIN (select * from workouts where user_id = 5) workouts 
    ON exercises.workout_id = workouts.id
) 
ON series.date = exercises.created_at::date 
WINDOW 
   WIN AS (PARTITION BY date_trunc('day', series.date)::date)
ORDER BY date ASC;

I think this should give you the desired output.

SQL select from generate_series, filtering by user_id removes series?
