Question

I have a table in my Postgres DB that looks like this:

date          duration
2018-05-10      10
2018-05-12      15
2018-06-01      10
2018-06-02      20
2019-01-01      5
2019-01-02      15
2019-04-01      10

I want to sum the values for each month and group them by year, month, and month number, like this:

year    month    month_number   monthly_sum
2018    May         5              25
2018    June        6              30
2019    Jan         1              20
2019    Apr         4              10

I ended up with the following query:

SELECT 
  to_char(date_trunc('month', date), 'YYYY') AS year,
  to_char(date_trunc('month', date), 'Mon') AS month,
  to_char(date_trunc('month', date), 'MM') AS month_number,
  sum(duration) AS monthly_sum
FROM timesheet 
GROUP BY year, month, month_number

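It works fine. My question is: is this query considered bad? Will it affect performance if I have around 100k rows? I have heard that using to_char is inferior to date_trunc, which is what I tried to avoid here; I just wrapped the date_trunc in a to_char. Also, does having three values in the GROUP BY clause affect anything?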

Answer 1

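Applying functions to a column and then grouping by the results may hurt performance. For this, it is preferable to have a Calendar table with appropriate indexes, so that you don't have to repeat this kind of date manipulation on every table.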

Check this and this (Calendar Table).
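A rough sketch of what such a Calendar table could look like for this case. The table name `calendar`, its column names, and the 2018-2019 date range are assumptions made for illustration, and it assumes `timesheet.date` is of type `date`. Whether this is actually faster than the original query depends on the data, so it is worth comparing both with EXPLAIN ANALYZE.

```sql
-- Hypothetical calendar table: one row per day with precomputed labels.
CREATE TABLE calendar (
    day          date PRIMARY KEY,
    year         integer NOT NULL,
    month_number integer NOT NULL,
    month        text    NOT NULL
);

-- Fill it once for the range of dates you expect (range assumed here).
INSERT INTO calendar (day, year, month_number, month)
SELECT d::date,
       extract(year  FROM d)::integer,
       extract(month FROM d)::integer,
       to_char(d, 'Mon')
FROM generate_series('2018-01-01'::timestamp,
                     '2019-12-31'::timestamp,
                     interval '1 day') AS g(d);

-- The report then joins on the plain date and groups by stored columns.
SELECT c.year, c.month, c.month_number, sum(t.duration) AS monthly_sum
FROM timesheet AS t
JOIN calendar  AS c ON c.day = t.date
GROUP BY c.year, c.month, c.month_number
ORDER BY c.year, c.month_number;
```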

Answer 2

The query is not bad, but you can simplify it.

SELECT to_char(date_trunc('month', date), 'YYYY') AS year,
       to_char(date_trunc('month', date), 'Mon') AS month,
       to_char(date_trunc('month', date), 'MM') AS month_number,
       sum(duration) AS monthly_sum
FROM timesheet 
GROUP BY date_trunc('month', date);

From a performance perspective, a shorter GROUP BY key has a small effect, but that is not something I would worry about.
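One detail that neither query pins down is row order: without an ORDER BY, Postgres does not guarantee the order of the result. A minimal variant of the simplified query that sorts chronologically by the same grouping expression might look like this:

```sql
SELECT to_char(date_trunc('month', date), 'YYYY') AS year,
       to_char(date_trunc('month', date), 'Mon')  AS month,
       to_char(date_trunc('month', date), 'MM')   AS month_number,
       sum(duration)                              AS monthly_sum
FROM timesheet
GROUP BY date_trunc('month', date)
-- Sorting by the truncated timestamp orders by year, then month.
ORDER BY date_trunc('month', date);
```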

Answer 3

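Since the query has no filter condition, it will always read every row of the table: that is the main factor in its performance. If you had a filter condition, you would benefit from having the right index.

That said, the way you extract the year and month could be slightly improved, as shown in the other answers here, but this has little effect on the query's performance.

In conclusion, without a filter condition, your query is already close to optimal.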
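To illustrate the point about filter conditions, here is a sketch that restricts the report to a single year. The index name `timesheet_date_idx` and the chosen date range are assumptions for illustration; whether the planner actually uses the index depends on the table size and how selective the range is.

```sql
-- Hypothetical index on the raw date column.
CREATE INDEX timesheet_date_idx ON timesheet (date);

-- The range predicate is written against the bare column, so the index
-- is usable; a function applied to the column in WHERE would generally
-- not match this plain index.
SELECT to_char(date_trunc('month', date), 'YYYY') AS year,
       to_char(date_trunc('month', date), 'Mon')  AS month,
       to_char(date_trunc('month', date), 'MM')   AS month_number,
       sum(duration)                              AS monthly_sum
FROM timesheet
WHERE date >= DATE '2019-01-01'
  AND date <  DATE '2020-01-01'
GROUP BY date_trunc('month', date);
```

Prefixing the SELECT with EXPLAIN shows whether Postgres chooses an index scan or a sequential scan for a given range.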

Summing column values and grouping dates by month with Postgres
