How do I retrieve all columns for a time-range aggregation?

Time: 2018-04-10 10:15:01

Tags: mysql impala

I'm currently struggling to work out how to roll up my daily data into other time aggregations (week, month, quarter, etc.).

Here is what my raw data looks like:

| date     | traffic_type | visits  |
|----------|--------------|---------|
| 20180101 | 1            | 1221650 |
| 20180101 | 2            | 411424  |
| 20180101 | 4            | 108407  |
| 20180101 | 5            | 298117  |
| 20180101 | 6            | 26806   |
| 20180101 | 7            | 12033   |
| 20180101 | 8            | 80368   |
| 20180101 | 9            | 69544   |
| 20180101 | 10           | 39919   |
| 20180101 | 11           | 26291   |
| 20180102 | 1            | 1218490 |
| 20180102 | 2            | 410965  |
| 20180102 | 4            | 108037  |
| 20180102 | 5            | 297727  |
| 20180102 | 6            | 26719   |
| 20180102 | 7            | 12019   |
| 20180102 | 8            | 80074   |

First, I want to see the number of visits regardless of traffic_type:

SELECT date, SUM(visits) as visits_per_day
FROM visits_tbl
GROUP BY date

The result looks like this:

|    ymd   | visits_per_day |
|:--------:|:--------------:|
| 20180101 |     2294563    |
| 20180102 |     2289145    |
| 20180103 |     2300367    |
| 20180104 |     2310256    |
| 20180105 |     2368098    |
| 20180106 |     2372257    |
| 20180107 |     2373863    |
| 20180108 |     2364236    |

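However, when I want to find the specific date on which visits_per_day is highest within each time aggregation (for example, month), I can't get the right output.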

Here is what I tried:

SELECT 
   (date div 100) as y_month, MAX(visits_per_day) as max_visit_per_day
FROM
    (SELECT date, SUM(visits) as visits_per_day
    FROM visits_tbl
    GROUP BY date) as t1
GROUP BY
   y_month

This is the output of my query:

| y_month | max_visit_per_day |
|:-------:|:-----------------:|
|  201801 |      2435845      |
|  201802 |      2519000      |
|  201803 |      2528097      |
|  201804 |      2550645      |

However, it doesn't tell me the exact date on which visits_per_day was highest.

Desired output:

| y_month | max_visit_per_day |    ymd   |
|:-------:|:-----------------:|:--------:|
|  201801 |      2435845      | 20180130 |
|  201802 |      2519000      | 20180220 |
|  201803 |      2528097      | 20180325 |
|  201804 |      2550645      | 20180406 |

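ymd is the date on which visits_per_day was highest. This logic will be driven programmatically from a dashboard so that the time aggregation can be selected automatically. Can anyone help me?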

2 Answers:

Answer 0: (score: 0)

This is a job for the Structured part of Structured Query Language. That is, you write some subqueries and treat them as tables.

You already know how to find the number of visits per day. We'll add each day's month to that query (http://sqlfiddle.com/#!9/a8455e/13/0).

                   SELECT date DIV 100 as month, date, 
                          SUM(visits) as visits
                     FROM visits_tbl
                    GROUP BY date

Next, you need to find the largest number of daily visits in each month. (http://sqlfiddle.com/#!9/a8455e/12/0)

       SELECT month, MAX(visits) max_daily_visits
         FROM (
                   SELECT date DIV 100 as month, date, 
                          SUM(visits) as visits
                     FROM visits_tbl
                    GROUP BY date
              ) dayvisits
        GROUP BY month

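Then the trick is to retrieve the date on which the maximum occurred in each month. That requires a join. Without common table expressions (which MySQL versions before 8.0 lack), you need to repeat the first subquery. (http://sqlfiddle.com/#!9/a8455e/11/0)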

SELECT detail.*
  FROM (
           SELECT month, MAX(visits) max_daily_visits
             FROM (
                       SELECT date DIV 100 as month, date, 
                              SUM(visits) as visits
                         FROM visits_tbl
                        GROUP BY date
                  ) dayvisits
            GROUP BY month
        ) maxvisits
   JOIN (
                       SELECT date DIV 100 as month, date, 
                              SUM(visits) as visits
                         FROM visits_tbl
                        GROUP BY date
        ) detail ON detail.visits = maxvisits.max_daily_visits
                AND detail.month = maxvisits.month

An outline of this rather complex query helps explain it. We'll use an imaginary table named dayvisits in place of that subquery.

SELECT detail.*
  FROM (
           SELECT month, MAX(visits) max_daily_visits
             FROM dayvisits 
            GROUP BY month
        ) maxvisits
   JOIN dayvisits detail ON detail.visits = maxvisits.max_daily_visits
                        AND detail.month = maxvisits.month

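You are looking for the extreme value for each month in the subquery. (This is a fairly standard SQL operation.) To do that, you find the value with a MAX() ... GROUP BY query, then join it back to the subquery itself to pick up the other values that go with the extreme value.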
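The MAX() ... GROUP BY plus self-join pattern described above can be tried end to end on a tiny table. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for MySQL: the sample rows are hypothetical, and `date / 100` replaces `date DIV 100` because SQLite has no DIV operator (integer division truncates there).

```python
import sqlite3

# Hypothetical sample rows (date, traffic_type, visits); not the asker's real data.
rows = [
    (20180101, 1, 100), (20180101, 2, 50),   # daily total 150
    (20180102, 1, 120), (20180102, 2, 70),   # daily total 190
    (20180201, 1, 90),  (20180201, 2, 30),   # daily total 120
    (20180202, 1, 60),  (20180202, 2, 20),   # daily total 80
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits_tbl (date INTEGER, traffic_type INTEGER, visits INTEGER)"
)
conn.executemany("INSERT INTO visits_tbl VALUES (?, ?, ?)", rows)

# Assumption: date / 100 stands in for MySQL's date DIV 100.
DAYVISITS = """
    SELECT date / 100 AS month, date, SUM(visits) AS visits
      FROM visits_tbl
     GROUP BY date
"""

# Find each month's maximum, then join back to the daily totals to get the date.
query = f"""
    SELECT detail.month, maxvisits.max_daily_visits, detail.date
      FROM (
               SELECT month, MAX(visits) AS max_daily_visits
                 FROM ({DAYVISITS}) dayvisits
                GROUP BY month
           ) maxvisits
      JOIN ({DAYVISITS}) detail
        ON detail.visits = maxvisits.max_daily_visits
       AND detail.month = maxvisits.month
     ORDER BY detail.month
"""

peak_days = conn.execute(query).fetchall()
print(peak_days)  # [(201801, 190, 20180102), (201802, 120, 20180201)]
```

Each output row has the shape the question asks for: month, peak daily visits, and the date of the peak.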

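If you do have common table expressions, the query looks like this. You might consider adopting MariaDB, a fork of MySQL that has CTEs (MySQL itself added them in version 8.0).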

WITH dayvisits AS (
    SELECT date DIV 100 as month, date,
           SUM(visits) as visits
      FROM visits_tbl
     GROUP BY date
)
SELECT dayvisits.*
  FROM (
           SELECT month, MAX(visits) max_daily_visits
             FROM dayvisits
            GROUP BY month
        ) maxvisits
   JOIN dayvisits ON dayvisits.visits = maxvisits.max_daily_visits
                AND dayvisits.month = maxvisits.month
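One property of this join is worth checking before wiring it into a dashboard: when two days in a month share the same maximum, both come back. The following is a minimal sketch of that behavior, again assuming Python's sqlite3 as a stand-in engine, hypothetical sample rows, and `date / 100` in place of `date DIV 100`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits_tbl (date INTEGER, traffic_type INTEGER, visits INTEGER)"
)
# Hypothetical rows in which two days tie for the monthly maximum.
conn.executemany("INSERT INTO visits_tbl VALUES (?, ?, ?)", [
    (20180301, 1, 80), (20180301, 2, 20),   # daily total 100
    (20180302, 1, 60), (20180302, 2, 40),   # daily total 100 (tie)
    (20180303, 1, 50), (20180303, 2, 10),   # daily total 60
])

# Same CTE shape as above; date / 100 is the SQLite substitute for date DIV 100.
cte_query = """
    WITH dayvisits AS (
        SELECT date / 100 AS month, date, SUM(visits) AS visits
          FROM visits_tbl
         GROUP BY date
    )
    SELECT dayvisits.month, dayvisits.date, dayvisits.visits
      FROM (
               SELECT month, MAX(visits) AS max_daily_visits
                 FROM dayvisits
                GROUP BY month
           ) maxvisits
      JOIN dayvisits ON dayvisits.visits = maxvisits.max_daily_visits
                    AND dayvisits.month = maxvisits.month
     ORDER BY dayvisits.date
"""

tied_days = conn.execute(cte_query).fetchall()
print(tied_days)  # [(201803, 20180301, 100), (201803, 20180302, 100)]
```

If exactly one row per month is required, the caller has to choose a tie-break rule, for example keeping the earliest date.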

Answer 1: (score: 0)

[Query for MSSQL] Fast and efficient.

select visit_sum_day_wise.date
     , visit_sum_day_wise.Max_visits
     , visit_sum_day_wise.traffic_type
     , LAST_VALUE(visit_sum_day_wise.visits) OVER (
           PARTITION BY visit_sum_day_wise.date
           ORDER BY visit_sum_day_wise.date
           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
       ) AS max_visit_per_day
from (
     -- Running maximum of visits within each date.
     select visits_tbl.date, visits_tbl.visits, visits_tbl.traffic_type
          , max(visits_tbl.visits) OVER (
                PARTITION BY visits_tbl.date
                ORDER BY visits_tbl.date
                ROWS BETWEEN UNBOUNDED PRECEDING AND 0 PRECEDING
            ) Max_visits
     from visits_tbl
     ) as visit_sum_day_wise
-- Keep only the rows whose visits equal the maximum for that date.
where visit_sum_day_wise.visits = (select max(visits_B.visits)
                                   from visits_tbl visits_B
                                   where visits_B.date = visit_sum_day_wise.date)
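
For comparison, the monthly peak day the question asks about can also be picked with a ranking window function. The following is a minimal sketch and not this answer's query: it runs through Python's sqlite3 (assuming the bundled SQLite is 3.25 or later, the release that added window functions), uses hypothetical sample rows, and writes `date / 100` where MySQL would use `date DIV 100`. Breaking ties by earliest date is a choice made here purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits_tbl (date INTEGER, traffic_type INTEGER, visits INTEGER)"
)
# Hypothetical sample rows (date, traffic_type, visits).
conn.executemany("INSERT INTO visits_tbl VALUES (?, ?, ?)", [
    (20180101, 1, 100), (20180101, 2, 50),   # daily total 150
    (20180102, 1, 120), (20180102, 2, 70),   # daily total 190
    (20180201, 1, 90),  (20180201, 2, 30),   # daily total 120
    (20180202, 1, 60),  (20180202, 2, 20),   # daily total 80
])

# Rank the daily totals inside each month and keep the top-ranked day.
ranked_query = """
    SELECT y_month, visits_per_day AS max_visit_per_day, date AS ymd
      FROM (
               SELECT date / 100 AS y_month, date, visits_per_day,
                      ROW_NUMBER() OVER (
                          PARTITION BY date / 100
                          ORDER BY visits_per_day DESC, date
                      ) AS rn
                 FROM (
                          SELECT date, SUM(visits) AS visits_per_day
                            FROM visits_tbl
                           GROUP BY date
                      ) AS daily
           ) AS ranked
     WHERE rn = 1
     ORDER BY y_month
"""

monthly_peaks = conn.execute(ranked_query).fetchall()
print(monthly_peaks)  # [(201801, 190, 20180102), (201802, 120, 20180201)]
```

Because ROW_NUMBER() assigns exactly one row the rank 1 in each partition, this variant returns a single day per month even when totals tie.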
