Calculating the average between 2 columns in SQL

Time: 2014-12-21 15:37:07

Tags: mysql sql join

I have a table named validation_errors that looks like this:

+-------------+--------------+------+-----+---------+----------------+
| Field       | Type         | Null | Key | Default | Extra          |
+-------------+--------------+------+-----+---------+----------------+
| id          | int(11)      | NO   | PRI | NULL    | auto_increment |
| link        | varchar(200) | NO   | MUL | NULL    |                |
| message     | varchar(500) | NO   |     |         |                |
| explanation | mediumtext   | NO   |     | NULL    |                |
| type        | varchar(50)  | NO   |     |         |                |
| subtype     | varchar(50)  | NO   |     |         |                |
| message_id  | varchar(50)  | NO   |     |         |                |
+-------------+--------------+------+-----+---------+----------------+

The links table looks like this:

+-----------+--------------+------+-----+---------+-------+
| Field     | Type         | Null | Key | Default | Extra |
+-----------+--------------+------+-----+---------+-------+
| link      | varchar(200) | NO   | PRI | NULL    |       |
| visited   | tinyint(1)   | NO   |     | 0       |       |
| validated | tinyint(1)   | NO   |     | 0       |       |
+-----------+--------------+------+-----+---------+-------+

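I want to calculate the average number of validation errors per page for each top-level domain. I have a query that gets the number of pages per top-level domain: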

    SELECT substr(link, - instr(reverse(link), '.')) as domain , count(*) as count
    FROM links
    GROUP BY domain
    ORDER BY count desc
    limit 30;

And I have a SQL query that gets the number of validation errors per top-level domain:

    SELECT substr(link, - instr(reverse(link), '.')) as domain ,count(*) as count
    FROM validation_errors
    GROUP BY domain
    ORDER BY count desc
    limit 30;

What I need to do now is combine them into one query and divide the results of one column by the other, and I cannot figure out how to do it.

Any help would be greatly appreciated.

1 Answer:

Answer 0: (score: 0)

First, use substring_index() rather than your construct. Here is a query that combines the two:

    select domain, sum(numviews) as numviews, sum(numerrors) as numerrors,
           sum(numerrors) / nullif(sum(numviews), 0) as error_rate
    from ((SELECT substring_index(link, '.', -1) as domain, count(*) as numviews, 0 as numerrors
           FROM links
           GROUP BY domain
          ) UNION ALL
          (SELECT substring_index(link, '.', -1) as domain, 0, count(*)
           FROM validation_errors
           GROUP BY domain
          )
         ) d
    GROUP BY domain;
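
A small illustration of the swap, not part of the original query: substring_index(link, '.', -1) returns the text after the last dot, while the substr/instr/reverse construct from the question also keeps the dot itself. The host name below is a made-up value used only for this example:

    -- 'www.example.com' is a hypothetical value, chosen only for illustration
    SELECT substring_index('www.example.com', '.', -1) as new_domain,        -- 'com'
           substr('www.example.com',
                  - instr(reverse('www.example.com'), '.')) as old_domain;   -- '.com'

For links that contain a dot, both forms group the same rows together; only the label differs.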

With two measures, I do not know which 30 you want to select, so I have not included an order by.
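
If the goal were, say, the 30 domains with the highest error rate, an order by and a limit could be appended to the outer query. This is only a sketch: ranking by error_rate is an assumption, since the question does not say which measure to rank by.

    select domain, sum(numviews) as numviews, sum(numerrors) as numerrors,
           sum(numerrors) / nullif(sum(numviews), 0) as error_rate
    from ((SELECT substring_index(link, '.', -1) as domain, count(*) as numviews, 0 as numerrors
           FROM links
           GROUP BY domain
          ) UNION ALL
          (SELECT substring_index(link, '.', -1) as domain, 0, count(*)
           FROM validation_errors
           GROUP BY domain
          )
         ) d
    GROUP BY domain
    -- assumption: rank by error rate; swap in numviews or numerrors as needed
    ORDER BY error_rate desc
    limit 30;

In MySQL, NULL sorts last in a descending sort, so domains with no pages (where error_rate is NULL) end up at the bottom.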

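Note that this does not use a join; it uses union all followed by aggregation. This ensures that you get all domains, even those with no views and those with no errors.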
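
For comparison, here is a sketch of a join-based version (not what the query above does). A left join from the links counts keeps domains without errors, but it silently drops any domain that appears only in validation_errors. MySQL has no full outer join, which is part of why union all is the simpler route.

    select l.domain, l.numviews, coalesce(e.numerrors, 0) as numerrors,
           coalesce(e.numerrors, 0) / l.numviews as error_rate
    from (SELECT substring_index(link, '.', -1) as domain, count(*) as numviews
          FROM links
          GROUP BY domain
         ) l LEFT JOIN
         (SELECT substring_index(link, '.', -1) as domain, count(*) as numerrors
          FROM validation_errors
          GROUP BY domain
         ) e
         ON l.domain = e.domain;  -- domains present only in validation_errors are lost here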