Rails: Calculating statistics and grouping by day

Time: 2016-04-24 13:31:34

Tags: ruby-on-rails postgresql

I am calculating statistics on a LaserSheet model in order to build morris.js charts for a dashboard page. I currently have one statistic working:

# Show four Mondays ago up to this coming Sunday (4 weeks)
start_date = Time.zone.now.beginning_of_week - 3.weeks
end_date   = Time.zone.now.end_of_week

# Calculate sheets cut per day
empty_dates_hash = Hash[(start_date.to_date..end_date.to_date).collect { |v| [v, 0] }]  
recent_cut_stats = LaserSheet.where('cut_at IS NOT NULL')
                             .where('cut_at > ?', start_date.beginning_of_day)
                             .where('cut_at < ?', end_date.end_of_day)
                             .group("DATE(cut_at::TIMESTAMPTZ AT TIME ZONE '#{Time.zone.now.formatted_offset}'::INTERVAL)")
                             .count
recent_cut_stats = empty_dates_hash.merge(recent_cut_stats)

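I would like to add a historical "sheets left to cut" statistic grouped by day. To do this, for each date I need to find all LaserSheets whose created_at is on or before that date and whose cut_at is NULL or later than that date.

I can do this manually for yesterday:

LaserSheet.where('created_at < ?', Time.zone.yesterday.end_of_day)
          .where('cut_at IS NULL OR cut_at > ?', Time.zone.yesterday.end_of_day)
          .count

And for today:

LaserSheet.where('created_at < ?', Time.zone.today.end_of_day)
          .where('cut_at IS NULL OR cut_at > ?', Time.zone.today.end_of_day)
          .count

I could repeat this for every day in [start_date..end_date], but that would be inefficient. Is there a way to do this with a single database query? It is not as trivial as a simple group-by-day.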

I am using PostgreSQL and Rails 4.

1 answer:

Answer 0 (score: 0)

I won't be able to write remotely efficient, or even syntactically valid, Ruby code, but here is some raw SQL that may help you.

You should use generate_series to generate the list of dates, and then left join against that list:

SELECT *
FROM generate_series(
    date_trunc('day', now()) - CAST('7 day' AS interval),
    date_trunc('day', now()),
    CAST('1 day' AS interval)
);

    generate_series
------------------------
 2016-04-17 00:00:00+00
 2016-04-18 00:00:00+00
 2016-04-19 00:00:00+00
 2016-04-20 00:00:00+00
 2016-04-21 00:00:00+00
 2016-04-22 00:00:00+00
 2016-04-23 00:00:00+00
 2016-04-24 00:00:00+00
(8 rows)
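
For readers more at home in Ruby, the same inclusive list of days can be sketched with a plain Date range. This is only an illustration: the helper name days_back_from is hypothetical, and the date is fixed so that the list lines up with the output above.

```ruby
require 'date'

# Inclusive list of days from (today - n) through today, mirroring
# generate_series(today - n days, today, 1 day). Both endpoints are kept,
# which is why a 7-day lookback yields 8 rows.
# NOTE: days_back_from is a hypothetical helper, not part of the original answer.
def days_back_from(today, n)
  ((today - n)..today).to_a
end

# Fixed date so the list matches the output above; in the app this would
# presumably come from Time.zone.today (assumption).
days = days_back_from(Date.new(2016, 4, 24), 7)

# Seed every day with 0, as empty_dates_hash does in the question.
zero_counts = days.map { |d| [d, 0] }.to_h

puts days.length   # 8
puts days.first    # 2016-04-17
puts days.last     # 2016-04-24
```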

Now that you know how to generate a series of dates, all you have to do is join against those dates with the right clause.

But first, we need some test data:

SELECT
    CAST(created_at AS timestamp),
    CAST(cut_at AS timestamp)
FROM (
    VALUES
        ('2016-04-20', null),           /* not cut yet */
        ('2016-04-20', '2016-04-22'),   /* cut 2 days ago */
        ('2016-04-20', null),           /* not cut yet */
        ('2016-04-23', '2016-04-23'),   /* cut yesterday */
        ('2016-04-23', null),           /* not cut yet */
        ('2016-04-24', '2016-04-24'),   /* cut today */
        ('2016-04-24', '2016-04-26')    /* cut tomorrow (because I can :p) */
) as laser_sheet(created_at, cut_at);


     created_at      |       cut_at
---------------------+---------------------
 2016-04-20 00:00:00 |
 2016-04-20 00:00:00 | 2016-04-22 00:00:00
 2016-04-20 00:00:00 |
 2016-04-23 00:00:00 | 2016-04-23 00:00:00
 2016-04-23 00:00:00 |
 2016-04-24 00:00:00 | 2016-04-24 00:00:00
 2016-04-24 00:00:00 | 2016-04-26 00:00:00
(7 rows)

The final query should look like this:

WITH date_serie AS (
    /* generate one row by day for the last 7 days */
    SELECT generate_series as day
    FROM generate_series(
        /* replace "CAST('2016-04-24 16:56:23' AS timestamp)" with "now()" to get a dynamic view */
        date_trunc('day', CAST('2016-04-24 16:56:23' AS timestamp)) - CAST('7 day' AS interval),
        date_trunc('day', CAST('2016-04-24 16:56:23' AS timestamp)),
        CAST('1 day' AS interval)
    )
),
laser_sheet AS (
    /* below is some test data */
    SELECT
        CAST(created_at AS timestamp) AS created_at,
        CAST(cut_at AS timestamp) AS cut_at
    FROM (
        VALUES
            ('2016-04-20', null),           /* not cut yet */
            ('2016-04-20', '2016-04-22'),   /* cut 2 days ago */
            ('2016-04-20', null),           /* not cut yet */
            ('2016-04-23', '2016-04-23'),   /* cut yesterday */
            ('2016-04-23', null),           /* not cut yet */
            ('2016-04-24', '2016-04-24'),   /* cut today */
            ('2016-04-24', '2016-04-26')    /* cut tomorrow (because I can :p) */
    ) as laser_sheet(created_at, cut_at)
)
SELECT
    date_serie.day,
    /* we need to count if any laser_sheet matches this day */
    count(laser_sheet.*) as sheets_left_to_cut
FROM
    date_serie
    LEFT JOIN laser_sheet
    /* notice here your custom join clause */
    ON laser_sheet.created_at < date_serie.day
    AND (
        laser_sheet.cut_at IS NULL
        OR laser_sheet.cut_at > date_serie.day
    )
GROUP BY
    date_serie.day
ORDER BY
    date_serie.day
;

Here is the result:

         day         | sheets_left_to_cut
---------------------+--------------------
 2016-04-17 00:00:00 |                  0
 2016-04-18 00:00:00 |                  0
 2016-04-19 00:00:00 |                  0
 2016-04-20 00:00:00 |                  0
 2016-04-21 00:00:00 |                  3
 2016-04-22 00:00:00 |                  2
 2016-04-23 00:00:00 |                  2
 2016-04-24 00:00:00 |                  3
(8 rows)
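
On the Rails side, the query above could be wired in roughly as follows. This is a minimal sketch under stated assumptions, not a tested integration: it assumes the LaserSheet model maps to a table named laser_sheets (the Rails default), and the helper names sheets_left_sql and rows_to_counts are hypothetical. The join clause is copied from the query above, with the inline test data replaced by the real table.

```ruby
require 'date'

# Builds the answer's query against a real table instead of inline test data.
# ASSUMPTION: the table is named laser_sheets (Rails default for LaserSheet).
# Only Date objects are interpolated, via iso8601, so no free-form text
# reaches the SQL string.
def sheets_left_sql(start_date, end_date)
  <<-SQL
    SELECT d.day::date AS day, count(ls.*) AS sheets_left_to_cut
    FROM generate_series(
           '#{start_date.iso8601}'::timestamp,
           '#{end_date.iso8601}'::timestamp,
           '1 day'::interval
         ) AS d(day)
    LEFT JOIN laser_sheets ls
      ON ls.created_at < d.day
     AND (ls.cut_at IS NULL OR ls.cut_at > d.day)
    GROUP BY d.day
    ORDER BY d.day
  SQL
end

# Turns [[day, count], ...] rows into { Date => Integer }.
# Values are converted explicitly because, depending on the adapter version,
# they may come back as strings.
def rows_to_counts(rows)
  rows.map { |day, count| [Date.parse(day.to_s), count.to_i] }.to_h
end

# Inside a Rails app the two pieces might be combined like this (not run here):
#   sql  = sheets_left_sql(start_date.to_date, end_date.to_date)
#   rows = ActiveRecord::Base.connection.select_rows(sql)
#   sheets_left_stats = rows_to_counts(rows)
```

Note that the join compares against the start of each day (00:00), whereas the Ruby in the question compares against end_of_day, so the count reported for a given day here corresponds roughly to the question's count for the previous day. Comparing against d.day + interval '1 day' would be one way to line the two up. Time zone handling is left out of this sketch.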