MySQL query: aggregation at two different levels

Time: 2014-11-23 08:00:31

Tags: mysql sql database

I have two tables:

mysql> select * from report;
+----+----------+------------+------------------+-------------+
| id | campaign | advertiser | impression_count | click_count |
+----+----------+------------+------------------+-------------+
|  1 | camp1    | adv1       |               20 |           6 |
|  2 | camp2    | adv2       |               10 |           2 |
|  3 | camp1    | adv1       |               15 |           3 |
|  4 | camp2    | adv2       |                6 |           1 |
+----+----------+------------+------------------+-------------+
4 rows in set (0.00 sec)

mysql> select * from device;
+-----------+-----------+
| report_id | device_id |
+-----------+-----------+
|         1 | d1        |
|         1 | d2        |
|         2 | d1        |
|         2 | d3        |
|         2 | d4        |
|         3 | d2        |
|         3 | d4        |
|         4 | d3        |
|         4 | d4        |
|         4 | d5        |
+-----------+-----------+
10 rows in set (0.00 sec)

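I want a report aggregated at the campaign and advertiser level that contains the impression count, the click count, and the number of distinct device_ids. So I wrote the following query: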

SELECT 
    campaign,
    advertiser,
    sum(impression_count),
    sum(click_count),
    count(DISTINCT device_id)
FROM report 
LEFT JOIN device ON report.id = device.report_id
GROUP BY campaign, advertiser;
+----------+------------+-----------------------+------------------+---------------------------+
| campaign | advertiser | sum(impression_count) | sum(click_count) | count(distinct device_id) |
+----------+------------+-----------------------+------------------+---------------------------+
| camp1    | adv1       |                    70 |               18 |                         3 |
| camp2    | adv2       |                    48 |                9 |                         4 |
+----------+------------+-----------------------+------------------+---------------------------+

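Here, because of the join, impression_count and click_count are summed once per matching device row, so the totals are inflated. What I want is: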

+----------+------------+-----------------------+------------------+---------------------------+
| campaign | advertiser | sum(impression_count) | sum(click_count) | count(distinct device_id) |
+----------+------------+-----------------------+------------------+---------------------------+
| camp1    | adv1       |                    35 |                9 |                         3 |
| camp2    | adv2       |                    16 |                3 |                         4 |
+----------+------------+-----------------------+------------------+---------------------------+

http://sqlfiddle.com/#!2/05dd9d/1

I found a solution, but not a very good one:

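SELECT a.campaign, a.advertiser, a.ic, a.cc, COUNT(DISTINCT dr.device_id)
FROM (
    -- Aggregate per campaign/advertiser and collect the report ids as a CSV string
    SELECT
        GROUP_CONCAT(id) AS id,
        SUM(impression_count) AS ic,
        SUM(click_count) AS cc,
        campaign,
        advertiser
    FROM report har
    GROUP BY campaign, advertiser
) a
-- Match each device row to the group whose id list contains its report_id
LEFT JOIN device dr ON FIND_IN_SET(dr.report_id, a.id)
GROUP BY a.id;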

However, this relies on GROUP_CONCAT, which can cause problems when the concatenated result gets long.
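To make the length concern concrete: MySQL cuts GROUP_CONCAT output off at group_concat_max_len, which defaults to 1024. The following is only a rough Python simulation of that effect, not MySQL itself; the helper find_in_set and the 500 sequential ids are assumptions made up for the illustration.

```python
# MySQL's default group_concat_max_len
GROUP_CONCAT_MAX_LEN = 1024


def find_in_set(needle, csv):
    """Rough stand-in for MySQL FIND_IN_SET: 1-based position, or 0 if absent."""
    items = csv.split(",")
    return items.index(needle) + 1 if needle in items else 0


# Hypothetical group containing 500 report ids (1..500)
ids = [str(i) for i in range(1, 501)]
full = ",".join(ids)                      # what an untruncated GROUP_CONCAT would hold
truncated = full[:GROUP_CONCAT_MAX_LEN]   # what survives at the default limit

# Ids that can no longer be found in the truncated list
missing = [i for i in ids if find_in_set(i, truncated) == 0]

print(len(full))        # 1891
print(missing[0])       # 284
print(len(missing))     # 217
```

In this simulation every device row belonging to reports 284 through 500 would silently drop out of the join, which is why joining on the grouping columns is safer than joining on a concatenated id list.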

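2 answers:

Answer 0: (score: 3)

What you want to do is run two separate queries and then join their result sets. The outer select only picks the columns we actually want and joins the two temporary tables on their common values. If you do not want the distinct devices counted across the whole campaign, you can do the same thing with id and report_id instead.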

select `firsttable`.campaign, `firsttable`.advertiser, a, b, c from
  -- totals per campaign/advertiser, computed before device is involved
  (select min(id) as id, campaign, advertiser, sum(impression_count) as a, sum(click_count) as b
   from report
   group by campaign, advertiser
  ) as firsttable
  left join
  -- distinct devices per campaign/advertiser
  (select campaign, advertiser, count(distinct device_id) as c
   from device
   join report on report.id = device.report_id
   group by campaign, advertiser
  ) as secondtable on `firsttable`.campaign=`secondtable`.campaign and
                      `firsttable`.advertiser=`secondtable`.advertiser;

SQLFiddle:http://sqlfiddle.com/#!2/8bd63/20

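This query combines the following two temporary tables. The sample output includes an extra report row (id 5, camp1/adv2, 900 impressions and 900 clicks) that has no matching device rows: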

| ID | CAMPAIGN | ADVERTISER |   A |   B |
|----|----------|------------|-----|-----|
|  1 |    camp1 |       adv1 |  35 |   9 |
|  5 |    camp1 |       adv2 | 900 | 900 |
|  2 |    camp2 |       adv2 |  16 |   3 |

| CAMPAIGN | ADVERTISER | C |
|----------|------------|---|
|    camp1 |       adv1 | 3 |
|    camp2 |       adv2 | 4 |

Result:

| CAMPAIGN | ADVERTISER |   A |   B |      C |
|----------|------------|-----|-----|--------|
|    camp1 |       adv1 |  35 |   9 |      3 |
|    camp1 |       adv2 | 900 | 900 | (null) |
|    camp2 |       adv2 |  16 |   3 |      4 |
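For anyone who wants to check the two-subquery approach without a MySQL server, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. SQLite standing in for MySQL is an assumption of this sketch, the aliases t, d, ic, cc and dc are made up for the example, and the extra row with id 5 is taken from the sample output above.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption of this sketch)
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE report (id INTEGER PRIMARY KEY, campaign TEXT, advertiser TEXT,
                     impression_count INTEGER, click_count INTEGER);
CREATE TABLE device (report_id INTEGER, device_id TEXT);
INSERT INTO report VALUES
  (1,'camp1','adv1',20,6),(2,'camp2','adv2',10,2),
  (3,'camp1','adv1',15,3),(4,'camp2','adv2',6,1),
  (5,'camp1','adv2',900,900);
INSERT INTO device VALUES
  (1,'d1'),(1,'d2'),(2,'d1'),(2,'d3'),(2,'d4'),
  (3,'d2'),(3,'d4'),(4,'d3'),(4,'d4'),(4,'d5');
""")

rows = conn.execute("""
SELECT t.campaign, t.advertiser, t.ic, t.cc, d.dc
FROM (  -- totals per campaign/advertiser, computed on report alone
      SELECT campaign, advertiser,
             SUM(impression_count) AS ic, SUM(click_count) AS cc
      FROM report
      GROUP BY campaign, advertiser) AS t
LEFT JOIN (  -- distinct devices per campaign/advertiser
      SELECT r.campaign, r.advertiser,
             COUNT(DISTINCT dv.device_id) AS dc
      FROM report r
      JOIN device dv ON dv.report_id = r.id
      GROUP BY r.campaign, r.advertiser) AS d
  ON d.campaign = t.campaign AND d.advertiser = t.advertiser
ORDER BY t.campaign, t.advertiser
""").fetchall()

for row in rows:
    print(row)
# ('camp1', 'adv1', 35, 9, 3)
# ('camp1', 'adv2', 900, 900, None)
# ('camp2', 'adv2', 16, 3, 4)
```

The sums are computed before any join with device happens, so no report row is counted more than once, and the left join keeps groups that have no devices at all (shown as None here, NULL in SQL).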

The problem with your query is that report rows get duplicated when report is joined with device. You end up with something like this:

| CAMPAIGN | ADVERTISER | IMPRESSION_COUNT | CLICK_COUNT | DEVICE_ID |
|----------|------------|------------------|-------------|-----------|
|    camp1 |       adv1 |               20 |           6 |        d1 |
|    camp1 |       adv1 |               20 |           6 |        d2 |
|    camp2 |       adv2 |               10 |           2 |        d1 |
|    camp2 |       adv2 |               10 |           2 |        d3 |
|    camp2 |       adv2 |               10 |           2 |        d4 |
|    camp1 |       adv1 |               15 |           3 |        d2 |
|    camp1 |       adv1 |               15 |           3 |        d4 |
|    camp2 |       adv2 |                6 |           1 |        d3 |
|    camp2 |       adv2 |                6 |           1 |        d4 |
|    camp2 |       adv2 |                6 |           1 |        d5 |
|    camp1 |       adv2 |              900 |         900 |    (null) |
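The duplication can be reproduced with a small sketch. It loads the question's original four report rows into an in-memory SQLite database (SQLite in place of MySQL is an assumption here), counts how many device rows each report has, and then runs the naive join-then-aggregate query. The names fanout and naive are made up for the example.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption of this sketch)
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE report (id INTEGER PRIMARY KEY, campaign TEXT, advertiser TEXT,
                     impression_count INTEGER, click_count INTEGER);
CREATE TABLE device (report_id INTEGER, device_id TEXT);
INSERT INTO report VALUES
  (1,'camp1','adv1',20,6),(2,'camp2','adv2',10,2),
  (3,'camp1','adv1',15,3),(4,'camp2','adv2',6,1);
INSERT INTO device VALUES
  (1,'d1'),(1,'d2'),(2,'d1'),(2,'d3'),(2,'d4'),
  (3,'d2'),(3,'d4'),(4,'d3'),(4,'d4'),(4,'d5');
""")

# Device rows per report = how many times each report row appears after the join
fanout = dict(conn.execute(
    "SELECT report_id, COUNT(*) FROM device GROUP BY report_id"))

# The question's original query: sums are taken over the duplicated rows
naive = conn.execute("""
SELECT campaign, advertiser,
       SUM(impression_count), SUM(click_count), COUNT(DISTINCT device_id)
FROM report
LEFT JOIN device ON report.id = device.report_id
GROUP BY campaign, advertiser
ORDER BY campaign, advertiser
""").fetchall()

print(fanout)  # {1: 2, 2: 3, 3: 2, 4: 3}
print(naive)   # [('camp1', 'adv1', 70, 18, 3), ('camp2', 'adv2', 48, 9, 4)]
```

For camp1 the impressions come out as 20*2 + 15*2 = 70 instead of 35, because reports 1 and 3 each appear twice after the join. COUNT(DISTINCT device_id) is unaffected, since duplicates do not change a distinct count.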

Answer 1: (score: 0)

Maybe this can help you:

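SELECT
    r.campaign,
    r.advertiser,
    SUM(r.impression_count) AS ic,
    SUM(r.click_count) AS cc,
    -- Correlate on the grouping columns so the count covers
    -- every report in the group, not a single report id
    (SELECT COUNT(DISTINCT d.device_id)
       FROM device d
       JOIN report r2 ON r2.id = d.report_id
      WHERE r2.campaign = r.campaign
        AND r2.advertiser = r.advertiser) AS DD
FROM report r
GROUP BY r.campaign, r.advertiser;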
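A quick way to sanity-check any correlated-subquery answer is to compare the distinct device count per report with the count per campaign/advertiser group: they differ on the question's data, so the subquery has to look at all reports in the group. The sketch below does that in an in-memory SQLite database (SQLite in place of MySQL, and the aliases r, r2 and d, are assumptions of this example).

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption of this sketch)
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE report (id INTEGER PRIMARY KEY, campaign TEXT, advertiser TEXT,
                     impression_count INTEGER, click_count INTEGER);
CREATE TABLE device (report_id INTEGER, device_id TEXT);
INSERT INTO report VALUES
  (1,'camp1','adv1',20,6),(2,'camp2','adv2',10,2),
  (3,'camp1','adv1',15,3),(4,'camp2','adv2',6,1);
INSERT INTO device VALUES
  (1,'d1'),(1,'d2'),(2,'d1'),(2,'d3'),(2,'d4'),
  (3,'d2'),(3,'d4'),(4,'d3'),(4,'d4'),(4,'d5');
""")

# Distinct devices seen by each individual report
per_report = dict(conn.execute(
    "SELECT report_id, COUNT(DISTINCT device_id) FROM device GROUP BY report_id"))

# Distinct devices per group, via a subquery correlated on the grouping columns
per_group = conn.execute("""
SELECT r.campaign, r.advertiser,
       SUM(r.impression_count), SUM(r.click_count),
       (SELECT COUNT(DISTINCT d.device_id)
          FROM device d
          JOIN report r2 ON r2.id = d.report_id
         WHERE r2.campaign = r.campaign
           AND r2.advertiser = r.advertiser)
FROM report r
GROUP BY r.campaign, r.advertiser
ORDER BY r.campaign, r.advertiser
""").fetchall()

print(per_report)  # {1: 2, 2: 3, 3: 2, 4: 3}
print(per_group)   # [('camp1', 'adv1', 35, 9, 3), ('camp2', 'adv2', 16, 3, 4)]
```

No single report in camp1 sees more than 2 devices, while the group as a whole sees 3 (d1, d2, d4), so a subquery tied to one report id cannot return the per-group figure.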