MySQL data smoothing

Time: 2012-03-17 01:07:33

Tags: mysql average smoothing

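I have a MySQL database that stores memory data with timestamps. The data is very simple: the memory in use and the total memory available in the system. I now want to create a MySQL VIEW that performs a few simple calculations on this data to achieve some degree of smoothing (using a rolling-window average).

The initial table looks like this: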

id |date                     |mem_used    |mem_total
1  |2012-03-16 23:29:05      |467         |1024
2  |2012-03-16 23:30:05      |432         |1024
3  |2012-03-16 23:31:05      |490         |1024
4  |2012-03-16 23:33:05      |501         |1024
5  |2012-03-16 23:35:05      |396         |1024
6  |2012-03-16 23:39:05      |404         |1536
7  |2012-03-16 23:43:05      |801         |1536
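
To try the queries from the answers against this sample, the table can be recreated roughly as follows. This is only a sketch under assumptions: the table name mytable is borrowed from the initial query in the first answer, and the column types are guesses, since the question does not state them.

```sql
-- Column types are assumptions; the question does not specify them.
CREATE TABLE mytable (
    id        INT      NOT NULL PRIMARY KEY,
    `date`    DATETIME NOT NULL,
    mem_used  INT      NOT NULL,
    mem_total INT      NOT NULL
);

-- The seven sample rows from the table above.
INSERT INTO mytable (id, `date`, mem_used, mem_total) VALUES
    (1, '2012-03-16 23:29:05', 467, 1024),
    (2, '2012-03-16 23:30:05', 432, 1024),
    (3, '2012-03-16 23:31:05', 490, 1024),
    (4, '2012-03-16 23:33:05', 501, 1024),
    (5, '2012-03-16 23:35:05', 396, 1024),
    (6, '2012-03-16 23:39:05', 404, 1536),
    (7, '2012-03-16 23:43:05', 801, 1536);
```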

The resulting VIEW should look like this:

id |date                     |mem_used    |mem_total    |mem_5_min_avg    |mem_rate_usage
1  |2012-03-16 23:29:05      |467         |1024         |473              |0.46191406
2  |2012-03-16 23:30:05      |432         |1024         |455              |0.44433594
3  |2012-03-16 23:31:05      |490         |1024         |463              |0.45214844
4  |2012-03-16 23:33:05      |501         |1024         |449              |0.43847656
5  |2012-03-16 23:35:05      |396         |1024         |396              |0.38671875
6  |2012-03-16 23:39:05      |404         |1536         |603              |0.39257813
7  |2012-03-16 23:43:05      |801         |1536         |801              |0.52148438

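Requirements:

The existing columns stay the same, but the mem_5_min_avg column should contain the average of mem_used over the following 5 minutes, counting only rows whose mem_total is the same as the current row's (note that mem_total changes over time).

So the rows should be calculated as follows: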

  • Row 1 of mem_5_min_avg: (467 + 432 + 490 + 501) / 4 = 1890 / 4 = 472.5, rounded up to 473 <- 4 rows are averaged here because 2012-03-16 23:29:05 plus 5 minutes is 2012-03-16 23:34:05
  • Row 2 of mem_5_min_avg: (432 + 490 + 501 + 396) / 4 = 1819 / 4 = 454.75, rounded up to 455
  • Row 3 of mem_5_min_avg: (490 + 501 + 396) / 3 = 1387 / 3 = 462.33, rounded up to 463
  • Row 4 of mem_5_min_avg: (501 + 396) / 2 = 897 / 2 = 448.5, rounded up to 449
  • Row 5 of mem_5_min_avg: 396 <- no other rows are averaged in here, because although the next measurement falls within 5 minutes, mem_total changes.
  • Row 6 of mem_5_min_avg: (404 + 801) / 2 = 1205 / 2 = 602.5, rounded up to 603
  • Row 7 of mem_5_min_avg: 801

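Once mem_5_min_avg is calculated, I need the mem_rate_usage column, which gives memory usage as a simple ratio of the smoothed value to the total memory.

mem_rate_usage = mem_5_min_avg / mem_total

For example, row 3 of mem_rate_usage should be calculated as 463 / 1024 = 0.45214844, and the last row as 801 / 1536 = 0.52148438.

I have no idea how to approach this. I tried combining the AVG function with GROUP BY, but I don't actually want any grouping here. I want the view to have the same number of rows and the same data as the table, plus the smoothed values and the rate.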

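2 answers:

Answer 0: (score: 0)

Update 2:

I improved the query further, but it is still slow. I realized that TIMESTAMPDIFF is much slower than directly comparing UNIX_TIMESTAMP values. Changing the code from Update 1 as follows speeds it up by almost 20%.

Increasing the innodb_buffer_pool_size option in my.cnf also helps with speed.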

SELECT  `date` ,  `mem_used` ,  `mem_total` , `mem_5_min_avg` , 
(`mem_5_min_avg` / `mem_total`) AS mem_usage_rate
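FROM (
   SELECT *, (
      -- Rounded-up average of mem_used over the 300 seconds starting at t1.date,
      -- restricted to rows with the same mem_total and host_id as the current row.
      -- host_id does not appear in the sample table in the question.
      SELECT CEILING( AVG( mem_used ) )
      FROM `data` AS t2
      WHERE UNIX_TIMESTAMP(t2.date) - UNIX_TIMESTAMP(t1.date) <=300 
      AND t2.date >= t1.date
      AND t1.mem_total = t2.mem_total
      AND t1.host_id = t2.host_id
   ) AS mem_5_min_avg
   FROM `data` AS t1
) AS t1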
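
One more thing that may be worth trying (this is not part of the original answer and has not been measured on the author's data): the time condition above wraps t2.date in a function call, which generally prevents MySQL from using an index on that column for the range. Writing the window as a plain range on the column, together with a composite index, gives the optimizer a chance to do an index range scan for each outer row. The index name idx_host_total_date and the aliases d1 and d2 are hypothetical; the table and columns are the ones used in the query above.

```sql
-- Hypothetical index: equality columns first, the range column last.
ALTER TABLE `data`
    ADD INDEX idx_host_total_date (host_id, mem_total, `date`);

SELECT t1.`date`, t1.mem_used, t1.mem_total, t1.mem_5_min_avg,
       (t1.mem_5_min_avg / t1.mem_total) AS mem_usage_rate
FROM (
    SELECT d1.*, (
        SELECT CEILING(AVG(d2.mem_used))
        FROM `data` AS d2
        WHERE d2.host_id   = d1.host_id
          AND d2.mem_total = d1.mem_total
          -- Same 300-second window, expressed directly on the indexed column.
          AND d2.`date` >= d1.`date`
          AND d2.`date` <= d1.`date` + INTERVAL 300 SECOND
    ) AS mem_5_min_avg
    FROM `data` AS d1
) AS t1;
```

Whether this actually helps depends on the data and the MySQL version; running EXPLAIN on both forms would show if the index is picked up.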

Update 1: I improved the query to make it about twice as fast, but it is still slow for my large table.

SELECT  `date` ,  `mem_used` ,  `mem_total` , `mem_5_min_avg` , 
(`mem_5_min_avg` / `mem_total`) AS mem_usage_rate
FROM (
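   SELECT *, (
      -- Rounded-up average of mem_used over the next 5 minutes, same mem_total only.
      SELECT CEILING( AVG( mem_used ) )
      FROM `data` AS t2
      -- TIMESTAMPDIFF(MINUTE, ...) counts whole minutes only,
      -- so gaps of up to 5 min 59 s pass this test.
      WHERE TIMESTAMPDIFF(
      MINUTE , t1.date, t2.date ) <=5
      AND t2.date >= t1.date
      AND t1.mem_total = t2.mem_total
   ) AS mem_5_min_avg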
   FROM `data` AS t1
) AS t1

INITIAL POST

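I asked the same question on ubuntuforums, and TeoBigusGeekus gave this answer. It works exactly as required, but it is very slow for my large table, which has more than 100000 rows: execution takes 7.5 seconds if I limit the query to 30 rows and more than 20 seconds if I limit it to 100, so I expect it would take far too long for all 100000 rows. Anyway, for anyone interested, here is the solution: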

SELECT  `date` ,  `mem_used` ,  `mem_total` , (
   SELECT CEILING( AVG( mem_used ) )
   FROM mytable AS t2
   WHERE TIMESTAMPDIFF(
   MINUTE , t1.date, t2.date ) <=5
   AND t2.date >= t1.date
   AND t1.mem_total = t2.mem_total
) AS mem_5_min_avg, (
   -- Same window as above; mem_total is qualified so it refers to the outer row.
   SELECT CEILING( AVG( mem_used ) ) / t1.mem_total
   FROM mytable AS t3
   WHERE TIMESTAMPDIFF(
   MINUTE , t1.date, t3.date ) <=5
   AND t3.date >= t1.date
   AND t1.mem_total = t3.mem_total
) AS mem_rate_usage
FROM mytable AS t1
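
Since the question asks for a VIEW, the flat form of the initial query can be wrapped in CREATE VIEW. The following is a minimal sketch, assuming the table is named mytable as above; the view name mem_smoothed is hypothetical. According to the MySQL CREATE VIEW documentation, versions before 5.7.7 do not allow a subquery in the FROM clause of a view, so the derived-table form used in Updates 1 and 2 may be rejected there, whereas scalar subqueries in the select list are accepted.

```sql
-- Hypothetical view name; table and column names follow the initial query.
CREATE OR REPLACE VIEW mem_smoothed AS
SELECT
    t1.id,
    t1.`date`,
    t1.mem_used,
    t1.mem_total,
    (
        -- Rounded-up average over the 5 minutes starting at t1.date,
        -- counting only rows with the same mem_total.
        SELECT CEILING(AVG(t2.mem_used))
        FROM mytable AS t2
        WHERE t2.`date` >= t1.`date`
          AND t2.`date` <= t1.`date` + INTERVAL 5 MINUTE
          AND t2.mem_total = t1.mem_total
    ) AS mem_5_min_avg,
    (
        -- Same window, divided by the outer row's mem_total.
        SELECT CEILING(AVG(t3.mem_used)) / t1.mem_total
        FROM mytable AS t3
        WHERE t3.`date` >= t1.`date`
          AND t3.`date` <= t1.`date` + INTERVAL 5 MINUTE
          AND t3.mem_total = t1.mem_total
    ) AS mem_rate_usage
FROM mytable AS t1;

-- Usage: the view then reads like an ordinary table.
SELECT * FROM mem_smoothed ORDER BY `date`;
```

The view does nothing for speed: it runs the same correlated subqueries every time it is queried, so the performance concerns above still apply.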

Answer 1: (score: 0)

SELECT
    rrd1.id,
    rrd1.date,
    rrd1.mem_used,
    rrd1.mem_total,
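    (
        -- Rounded-up average of mem_used over the 5 minutes starting at rrd1.date.
        -- There is no mem_total condition here, so the window also includes
        -- rows where mem_total has changed.
        SELECT
            CEILING(AVG(rrd2.mem_used))
        FROM
            rrd rrd2
        WHERE
            rrd2.date >= rrd1.date AND
            rrd2.date <= AddTime(rrd1.date, '00:05')
    ) AS mem_5_min_avg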
FROM
    rrd rrd1
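
On MySQL 8.0 or later, which postdates this thread, window functions offer another way to express the same calculation without a correlated subquery. The sketch below assumes the rrd table from the query above, with date being a DATETIME or TIMESTAMP column. It is not the answerer's method: it adds the mem_total condition from the question by partitioning on mem_total, and it also computes the ratio. The alias smoothed and the window name w are hypothetical.

```sql
-- Requires MySQL 8.0+ (window functions).
SELECT
    id, `date`, mem_used, mem_total, mem_5_min_avg,
    mem_5_min_avg / mem_total AS mem_rate_usage
FROM (
    SELECT
        id, `date`, mem_used, mem_total,
        CEILING(AVG(mem_used) OVER w) AS mem_5_min_avg
    FROM rrd
    WINDOW w AS (
        PARTITION BY mem_total   -- only rows with the same mem_total
        ORDER BY `date`
        -- From the current row's timestamp up to 5 minutes later, inclusive.
        RANGE BETWEEN CURRENT ROW AND INTERVAL 5 MINUTE FOLLOWING
    )
) AS smoothed
ORDER BY `date`;
```

Partitioning by mem_total matches the condition t1.mem_total = t2.mem_total in the other answer: rows are averaged together only if they share the same mem_total and fall within the 5-minute range.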