Need advice on a complex SQL query

Time: 2013-10-06 20:58:47

Tags: mysql sql

I have three tables with the following schemas:

Table: apps

| ID (bigint) | USERID (Bigint)|      START_TIME (datetime) | 
-------------------------------------------------------------
|  1          |        13     |         2013-05-03 04:42:55 | 
|  2          |        13     |         2013-05-12 06:22:45 |
|  3          |        13     |         2013-06-12 08:44:24 |    
|  4          |        13     |         2013-06-24 04:20:56 |       
|  5          |        13     |         2013-06-26 08:20:26 |       
|  6          |        13     |         2013-09-12 05:48:27 | 

Table: hosts

| ID (bigint) | APPID (Bigint)|         DEVICE_ID (Bigint)  | 
-------------------------------------------------------------
|  1          |        1      |                           1 | 
|  2          |        2      |                           1 |
|  3          |        1      |                           1 |    
|  4          |        3      |                           3 |       
|  5          |        1      |                           4 |      
|  6          |        2      |                           3 |

Table: usage

| ID (bigint) | APPID (Bigint)|             HOSTID (Bigint) |   Factor (varchar)    |  
-------------------------------------------------------------------------------------
|  1          |        1      |                           1 |               Low     | 
|  2          |        1      |                           3 |               High    | 
|  3          |        2      |                           2 |               Low     | 
|  4          |        3      |                           4 |               Medium  | 
|  5          |        1      |                           5 |               Low     | 
|  6          |        2      |                           2 |               Medium  | 

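Now, given a userid as input, I want to get the number of usage table rows per Factor for each of the last 6 months (across all apps).

If a DEVICE_ID appears more than once in a month (based on START_TIME, from joining apps and hosts), only the latest usage rows (based on the combination of apps, hosts, and usage) should be considered when calculating the count.

Example query output for the sample above (input userid = 13):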

| MONTH       | USAGE_COUNT   |               FACTOR        | 
-------------------------------------------------------------
|  5          |        0      |                 High        | 
|  6          |        0      |                 High        | 
|  7          |        0      |                 High        | 
|  8          |        0      |                 High        |       
|  9          |        0      |                 High        |       
|  10         |        0      |                 High        | 
|  5          |        2      |                 Low         | 
|  6          |        0      |                 Low         | 
|  7          |        0      |                 Low         | 
|  8          |        0      |                 Low         |       
|  9          |        0      |                 Low         |       
|  10         |        0      |                 Low         |
|  5          |        1      |                 Medium      | 
|  6          |        1      |                 Medium      | 
|  7          |        0      |                 Medium      | 
|  8          |        0      |                 Medium      |       
|  9          |        0      |                 Medium      |       
|  10         |        0      |                 Medium      |

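How is this calculated?

  1. In May 2013 (05-2013), there are two apps in the apps table.
  2. In the hosts table, these apps are associated with device_id values 1, 1, 1, 4, 3.
  3. For this month (05-2013), for device_id = 1 the latest start_time is 2013-05-12 06:22:45 (from tables hosts and apps), so in the usage table look for the combination appid = 2 & hostid = 2, which has two rows, one with factor Low and the other Medium.
  4. For this month (05-2013), for device_id = 4, following the same procedure we get one entry, i.e. 0 Low.
  5. All other values are calculated the same way.

To get the last 6 months via a query, I am trying the following: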

    SELECT MONTH(DATE_ADD(NOW(), INTERVAL aInt MONTH)) AS aMonth
        FROM
        (
            SELECT 0 AS aInt UNION SELECT -1 UNION SELECT -2 UNION SELECT -3 UNION SELECT -4 UNION SELECT -5
        ) AS offsets  -- MySQL requires an alias for every derived table

Please check the sqlfiddle: http://sqlfiddle.com/#!2/55fc2
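
The calculation steps above can be sketched in plain Python on the sample rows, which may help when checking a SQL solution against the expected output. This is only one reading of the rules, not the asker's or answerer's method: the function name `usage_counts` and the in-memory tuples are hypothetical, and the result is keyed by month number only, as in the example output.

```python
from collections import Counter
from datetime import datetime

# Sample rows copied from the question's tables.
apps = [  # (id, userid, start_time)
    (1, 13, "2013-05-03 04:42:55"),
    (2, 13, "2013-05-12 06:22:45"),
    (3, 13, "2013-06-12 08:44:24"),
    (4, 13, "2013-06-24 04:20:56"),
    (5, 13, "2013-06-26 08:20:26"),
    (6, 13, "2013-09-12 05:48:27"),
]
hosts = [  # (id, appid, device_id)
    (1, 1, 1), (2, 2, 1), (3, 1, 1), (4, 3, 3), (5, 1, 4), (6, 2, 3),
]
usage = [  # (id, appid, hostid, factor)
    (1, 1, 1, "Low"), (2, 1, 3, "High"), (3, 2, 2, "Low"),
    (4, 3, 4, "Medium"), (5, 1, 5, "Low"), (6, 2, 2, "Medium"),
]


def usage_counts(userid):
    """Count usage rows per (month, factor), keeping only the
    latest app per device and month (hypothetical helper)."""
    start = {app_id: datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
             for app_id, uid, ts in apps if uid == userid}

    # For every (year, month, device), remember the app/host pairs
    # that belong to the latest start_time seen for that device.
    latest = {}  # (year, month, device_id) -> (start_time, [(appid, hostid)])
    for host_id, app_id, device_id in hosts:
        if app_id not in start:
            continue  # host belongs to another user's app
        t = start[app_id]
        key = (t.year, t.month, device_id)
        if key not in latest or t > latest[key][0]:
            latest[key] = (t, [(app_id, host_id)])
        elif t == latest[key][0]:
            latest[key][1].append((app_id, host_id))

    # Count the usage rows that match those app/host pairs.
    counts = Counter()
    for (_year, month, _device), (_t, pairs) in latest.items():
        for _usage_id, u_app, u_host, factor in usage:
            if (u_app, u_host) in pairs:
                counts[(month, factor)] += 1
    return dict(counts)
```

For userid 13 this gives 2 Low and 1 Medium for month 5 and 1 Medium for month 6, which matches the non-zero rows of the example output. The zero rows would still have to be filled in by combining every month with every factor.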

1 Answer:

Answer 0 (score: 1)

Because the calculation you are doing involves the same joins several times, I started by creating a view.

    CREATE VIEW `app_host_usage`
    AS 
    SELECT a.id "appid", h.id "hostid", u.id "usageid",
           a.userid, a.start_time, h.device_id, u.factor
      FROM apps a
      LEFT OUTER JOIN hosts h ON h.appid = a.id
      LEFT OUTER JOIN `usage` u ON u.appid = a.id AND u.hostid = h.id
      WHERE a.start_time > DATE_ADD(NOW(), INTERVAL -7 MONTH)

The WHERE condition is there because I assume you don't want July 2005 and July 2006 grouped together.
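
A minimal Python sketch of why a month number alone is ambiguous. The timestamps below are hypothetical and chosen only to illustrate the point; they are not taken from the question's data.

```python
from collections import Counter
from datetime import datetime

# Hypothetical start times spanning two years (illustration only).
starts = [datetime(2005, 7, 4), datetime(2006, 7, 9), datetime(2006, 8, 1)]

# Grouping by month number merges July 2005 with July 2006.
by_month = Counter(t.month for t in starts)

# Grouping by (year, month) keeps them apart.
by_year_month = Counter((t.year, t.month) for t in starts)
```

Here `by_month[7]` is 2, while `by_year_month` holds one entry each for `(2005, 7)` and `(2006, 7)`. Restricting rows to a recent window, as the view's WHERE clause does, is another way to avoid the collision.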

With that view, the query becomes

    SELECT months.Month, COUNT(DISTINCT device_id), factors.factor
    FROM
      (
        -- Get the last six months
        SELECT (MONTH(NOW()) + aInt + 11) % 12 + 1 "Month" FROM
          (SELECT 0 AS aInt UNION SELECT -1 UNION SELECT -2 UNION SELECT -3 UNION SELECT -4 UNION SELECT -5) LastSix
      ) months
      JOIN
      ( 
        -- Get all known factors
        SELECT DISTINCT factor FROM `usage` 
      ) factors
      LEFT OUTER JOIN
      (
        -- Get factors for each device... 
        SELECT 
               MONTH(start_time) "Month", 
               device_id,
               factor
          FROM app_host_usage a
          WHERE userid=13 
            AND start_time IN (
              -- ...where the corresponding usage row is connected
              --    to an app row with the highest start time of the
              --    month for that device.
              SELECT MAX(start_time)
                FROM app_host_usage a2
                WHERE a2.device_id = a.device_id
                GROUP BY MONTH(start_time)
            )
         GROUP BY MONTH(start_time), device_id, factor

      ) usageids ON usageids.Month = months.Month 
                AND usageids.factor = factors.factor
    GROUP BY factors.factor, months.Month
    ORDER BY factors.factor, months.Month

This is pretty complex, but I have tried to add comments explaining what each part does. See this sqlfiddle: http://sqlfiddle.com/#!2/5c871/1/0
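
The month arithmetic used in the query, `(MONTH(NOW()) + aInt + 11) % 12 + 1`, can be checked outside the database. The sketch below mirrors that expression in Python; the function name `last_six_months` is hypothetical, and the current month is passed in as a parameter in place of `MONTH(NOW())`.

```python
def last_six_months(current_month):
    """Mirror (MONTH(NOW()) + aInt + 11) % 12 + 1 for aInt = 0, -1, ..., -5."""
    # Adding 11 before the modulo keeps the left operand positive for every
    # offset, so the result always lands in 1..12, including across January.
    return [(current_month + offset + 11) % 12 + 1
            for offset in range(0, -6, -1)]
```

For October, `last_six_months(10)` returns `[10, 9, 8, 7, 6, 5]`, the months listed in the question's example output. For March, `last_six_months(3)` returns `[3, 2, 1, 12, 11, 10]`, which shows the wrap-around into the previous year.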