Selecting COUNT and SUM columns from four MySQL tables with a WHERE clause

Time: 2014-11-03 16:24:11

Tags: mysql join count group-by

I have 5 tables in my schema.

The first is opn:

| opnID | submitID | emailID |       opnDate       | invalidOPN |
+-------+----------+---------+---------------------+------------+
|   1   |    6     |    1    | 2014-10-15 11:45:50 |      2     |
|   2   |    6     |    2    | 2014-10-15 11:55:52 |      0     |
|   3   |    6     |    3    | 2014-10-15 12:41:52 |     10     |
|   4   |    7     |    2    | 2014-10-15 17:45:22 |      1     |
|   5   |    7     |    3    | 2014-10-16 00:45:55 |      5     |
|   6   |    6     |    5    | 2014-10-16 01:45:11 |      0     |

I also have clk:

| clkID | submitID | emailID |       clkDate       | invalidCLK |
+-------+----------+---------+---------------------+------------+
|   1   |    6     |    1    | 2014-10-15 11:45:55 |      1     |
|   2   |    6     |    2    | 2014-10-15 11:55:59 |      0     |
|   3   |    6     |    3    | 2014-10-15 12:42:52 |      5     |
|   4   |    7     |    3    | 2014-10-15 17:46:12 |      0     |
|   5   |    6     |    5    | 2014-10-16 00:46:55 |      0     |

The users table:

| userID | firstName | secondName |
+--------+-----------+------------+
|   1    |    john   |    smith   |
|   2    |   susan   |    bella   |

The submission table:

| submitID | userID |
+----------+--------+
|    6     |   1    |
|    7     |   2    |

For each user, I need to count opn.submitID to get the number of opens, count clk.submitID to get the number of clicks, and get the totals of invalidOPN and invalidCLK.

Here is my expected result:

| userID | fName | sName | numberOfOpen | SUM(opn.invalidOPN) | numberOfClicks | SUM(clk.invalidCLK) |
+--------+-------+-------+--------------+---------------------+----------------+---------------------+
|   1    | john  | smith |      4       |          12         |        4       |         6           |
|   2    | susan | bella |      2       |           6         |        1       |         0           |

I tried these two queries, but neither gives me the result I need:

SELECT users.userID, users.FirstName, users.SecondName, count(opn.submitID) as "Number of Opens", sum(opn.InvalidOPN) as "Number of invalid Opens"

FROM users 

RIGHT JOIN ( submission INNER JOIN opn ON opn.submitID = submission.submitID and OPNDate between "2013-10-01 00:00:00" AND "2014-10-31 23:59:59" ) ON submission.UserID = users.UserID   group by users.userID 

UNION

SELECT users.userID, users.FirstName, users.SecondName, count(clk.submitID) as "Number of clicks", sum(clk.InvalidCLK) as "Number of invalid clicks"
FROM users
RIGHT JOIN ( submission INNER JOIN clk ON clk.submitID = submission.submitID and CLKDate between "2013-10-01 00:00:00" AND "2014-10-31 23:59:59") ON submission.UserID = users.UserID  group by users.userID 

SELECT users.userID, users.FirstName, users.SecondName, count(opn.submitID) as "Number of Opens", sum(opn.InvalidOPN) as "Number of invalid Opens", count(clk.submitID) as "Number of clicks", sum(clk.InvalidCLK) as "Number of invalid clicks"

FROM users, submission, clk, opn 

where opn.submitID = submission.submitID and clk.submitID = submission.submitID 
And CLKDate between "2013-10-01 00:00:00" AND "2014-10-31 23:59:59" 
AND submission.UserID = users.UserID  group by users.userID

Please help me and tell me what I need to change.

2 answers:

Answer 0: (score: 1)

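The main problem with doing this is that when you join the tables to each other, you get every combination of opn and clk records for a submission. In that situation you need to use COUNT(DISTINCT some_field_name) to count only the unique values: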

SELECT users.UserId,
        COUNT(DISTINCT opn.OPNID),
        COUNT(DISTINCT clk.CLKID)
FROM users
LEFT OUTER JOIN submission ON users.UserId = submission.UserId 
LEFT OUTER JOIN opn ON submission.SubmitID = opn.SubmitID 
LEFT OUTER JOIN clk ON submission.SubmitID = clk.SubmitID 
GROUP BY users.UserId

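However, that does not help in this case, because you also need the sums of the invalid___ fields, and those sums would still include the duplicated rows.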
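
The row multiplication described above can be reproduced on the question's sample data. The following is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, keeps only the columns needed for the aggregates, and the output aliases (joined_rows, inflated_invalid_opens, and so on) are illustrative names that do not appear in the original post.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; only the columns
# needed for the aggregates are kept from the question's tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE submission (submitID INTEGER, userID INTEGER);
CREATE TABLE opn (opnID INTEGER, submitID INTEGER, invalidOPN INTEGER);
CREATE TABLE clk (clkID INTEGER, submitID INTEGER, invalidCLK INTEGER);
INSERT INTO submission VALUES (6, 1), (7, 2);
INSERT INTO opn VALUES (1, 6, 2), (2, 6, 0), (3, 6, 10),
                       (4, 7, 1), (5, 7, 5), (6, 6, 0);
INSERT INTO clk VALUES (1, 6, 1), (2, 6, 0), (3, 6, 5),
                       (4, 7, 0), (5, 6, 0);
""")

# Joining opn and clk through submission pairs every open with every
# click of the same submission, so each opn row repeats once per click.
rows = conn.execute("""
SELECT s.userID,
       COUNT(*)                AS joined_rows,
       COUNT(DISTINCT o.opnID) AS opens,
       SUM(o.invalidOPN)       AS inflated_invalid_opens,
       COUNT(DISTINCT c.clkID) AS clicks,
       SUM(c.invalidCLK)       AS inflated_invalid_clicks
FROM submission s
LEFT JOIN opn o ON o.submitID = s.submitID
LEFT JOIN clk c ON c.submitID = s.submitID
GROUP BY s.userID
ORDER BY s.userID
""").fetchall()

for row in rows:
    print(row)
# (1, 16, 4, 48, 4, 24)
# (2, 2, 2, 6, 1, 0)
```

For user 1 the 4 opens and 4 clicks of submission 6 produce 16 joined rows. COUNT(DISTINCT ...) still reports 4 and 4, but the sums come out as 48 and 24 instead of the expected 12 and 6.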

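So I suggest using a couple of subqueries, one for clk and one for opn. Each subquery gets the count and the sum grouped by user ID, and the results of these subqueries are then joined to the users table.

Something like this: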

SELECT users.UserId,
        users.firstName,
        users.secondName,
        numberOfOpen,
        COALESCE(invalidopnsum, 0),
        numberOfClicks,
        COALESCE(invalidclksum, 0)
FROM users
LEFT OUTER JOIN
(
    SELECT submission.UserId, COUNT(opn.SubmitID) AS numberOfOpen, SUM(opn.InvalidOPN) AS invalidopnsum
    FROM submission 
    LEFT OUTER JOIN opn ON submission.SubmitID = opn.SubmitID 
    GROUP BY submission.UserId
) opn1
ON users.UserId = opn1.UserId 
LEFT OUTER JOIN
(
    SELECT submission.UserId, COUNT(clk.SubmitID) AS numberOfClicks, SUM(clk.InvalidCLK) AS invalidclksum
    FROM submission 
    LEFT OUTER JOIN clk ON submission.SubmitID = clk.SubmitID 
    GROUP BY submission.UserId
) clk1
ON users.UserId = clk1.UserId 
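
As a rough check of the subquery approach, it can be run against the sample rows from the question. This is a sketch under assumptions rather than the answerer's own test: SQLite (through Python's sqlite3) replaces MySQL, the second user is given userID 2 as in the expected result, the emailID and date columns are left out, and the short table aliases are illustrative.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; susan has userID 2,
# matching the submission table and the expected result in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (userID INTEGER PRIMARY KEY, firstName TEXT, secondName TEXT);
CREATE TABLE submission (submitID INTEGER PRIMARY KEY, userID INTEGER);
CREATE TABLE opn (opnID INTEGER PRIMARY KEY, submitID INTEGER, invalidOPN INTEGER);
CREATE TABLE clk (clkID INTEGER PRIMARY KEY, submitID INTEGER, invalidCLK INTEGER);
INSERT INTO users VALUES (1, 'john', 'smith'), (2, 'susan', 'bella');
INSERT INTO submission VALUES (6, 1), (7, 2);
INSERT INTO opn VALUES (1, 6, 2), (2, 6, 0), (3, 6, 10),
                       (4, 7, 1), (5, 7, 5), (6, 6, 0);
INSERT INTO clk VALUES (1, 6, 1), (2, 6, 0), (3, 6, 5),
                       (4, 7, 0), (5, 6, 0);
""")

# Each derived table is aggregated to one row per user before the join,
# so opens and clicks can no longer multiply each other.
rows = conn.execute("""
SELECT u.userID, u.firstName, u.secondName,
       COALESCE(o.numberOfOpen, 0),
       COALESCE(o.invalidOpnSum, 0),
       COALESCE(c.numberOfClicks, 0),
       COALESCE(c.invalidClkSum, 0)
FROM users u
LEFT JOIN (SELECT s.userID,
                  COUNT(opn.submitID) AS numberOfOpen,
                  SUM(opn.invalidOPN) AS invalidOpnSum
           FROM submission s
           LEFT JOIN opn ON opn.submitID = s.submitID
           GROUP BY s.userID) o ON o.userID = u.userID
LEFT JOIN (SELECT s.userID,
                  COUNT(clk.submitID) AS numberOfClicks,
                  SUM(clk.invalidCLK) AS invalidClkSum
           FROM submission s
           LEFT JOIN clk ON clk.submitID = s.submitID
           GROUP BY s.userID) c ON c.userID = u.userID
ORDER BY u.userID
""").fetchall()

for row in rows:
    print(row)
# (1, 'john', 'smith', 4, 12, 4, 6)
# (2, 'susan', 'bella', 2, 6, 1, 0)
```

The output matches the expected result table in the question. Because both joins to the derived tables are LEFT joins, a user with no submissions would still appear, with zeros supplied by COALESCE.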

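Answer 1: (score: 0)

This is easier than you think. All you need to do is join the tables together and use a few aggregate functions. You can join users to submission on the userID column, and join submission to the clk and opn tables on the submitID column. You can use COUNT() to get the number of opens and clicks, and SUM() to get the totals of the invalid columns. However, these calculations will not work in a single pass, because joining opn and clk in the same query repeats rows, so I suggest you aggregate each one in its own query and then join the results.

The query looks like this: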

SELECT t.userID, t.firstName, t.secondName, t.numOpen, t.totalInvalidOpen, w.numClick, w.totalInvalidClick
FROM (SELECT u.userID, u.firstName, u.secondName, COUNT(*) AS numOpen, SUM(o.invalidOPN) AS totalInvalidOpen
      FROM users u
      JOIN submission s ON s.userID = u.userID
      JOIN opn o ON o.submitID = s.submitID
      GROUP BY u.userID, u.firstName, u.secondName
      ) t
JOIN (SELECT u.userID, u.firstName, u.secondName, COUNT(*) AS numClick, SUM(c.invalidCLK) AS totalInvalidClick
      FROM users u
      JOIN submission s ON s.userID = u.userID
      JOIN clk c ON c.submitID = s.submitID
      GROUP BY u.userID, u.firstName, u.secondName
      ) w
ON w.userID = t.userID;

It works! Here is your SQL Fiddle. Note: the result set in your question is incorrect, because the total of invalid opens for user 1 is only 12.
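
The queries in the question restrict opnDate and clkDate to a date range, which the answer queries above leave out. One possible way to add it, sketched here under assumptions, is to filter inside each aggregating subquery so that opens and clicks are limited independently. SQLite via Python's sqlite3 stands in for MySQL, dates are stored as ISO-formatted text, the users table is omitted for brevity, and the helper name stats and the parameter names start and end are hypothetical.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; dates are ISO text,
# which compares correctly as plain strings.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE submission (submitID INTEGER PRIMARY KEY, userID INTEGER);
CREATE TABLE opn (opnID INTEGER PRIMARY KEY, submitID INTEGER,
                  opnDate TEXT, invalidOPN INTEGER);
CREATE TABLE clk (clkID INTEGER PRIMARY KEY, submitID INTEGER,
                  clkDate TEXT, invalidCLK INTEGER);
INSERT INTO submission VALUES (6, 1), (7, 2);
INSERT INTO opn VALUES
  (1, 6, '2014-10-15 11:45:50', 2), (2, 6, '2014-10-15 11:55:52', 0),
  (3, 6, '2014-10-15 12:41:52', 10), (4, 7, '2014-10-15 17:45:22', 1),
  (5, 7, '2014-10-16 00:45:55', 5), (6, 6, '2014-10-16 01:45:11', 0);
INSERT INTO clk VALUES
  (1, 6, '2014-10-15 11:45:55', 1), (2, 6, '2014-10-15 11:55:59', 0),
  (3, 6, '2014-10-15 12:42:52', 5), (4, 7, '2014-10-15 17:46:12', 0),
  (5, 6, '2014-10-16 00:46:55', 0);
""")

# The date filter sits inside each subquery, before aggregation.
QUERY = """
SELECT t.userID, t.numOpen, t.totalInvalidOpen, w.numClick, w.totalInvalidClick
FROM (SELECT s.userID, COUNT(*) AS numOpen, SUM(o.invalidOPN) AS totalInvalidOpen
      FROM submission s
      JOIN opn o ON o.submitID = s.submitID
      WHERE o.opnDate BETWEEN :start AND :end
      GROUP BY s.userID) t
JOIN (SELECT s.userID, COUNT(*) AS numClick, SUM(c.invalidCLK) AS totalInvalidClick
      FROM submission s
      JOIN clk c ON c.submitID = s.submitID
      WHERE c.clkDate BETWEEN :start AND :end
      GROUP BY s.userID) w ON w.userID = t.userID
ORDER BY t.userID
"""

def stats(start, end):
    """Hypothetical helper: per-user totals for events dated within [start, end]."""
    return conn.execute(QUERY, {"start": start, "end": end}).fetchall()

full = stats("2013-10-01 00:00:00", "2014-10-31 23:59:59")
one_day = stats("2014-10-15 00:00:00", "2014-10-15 23:59:59")
print(full)     # [(1, 4, 12, 4, 6), (2, 2, 6, 1, 0)]
print(one_day)  # [(1, 3, 12, 3, 6), (2, 1, 1, 1, 0)]
```

The range used in the question covers every sample row, so the first call reproduces the expected totals; the narrower one-day range is only there to show the filter taking effect. Since the two subqueries are combined with an inner JOIN, a user with opens but no clicks in the range (or the reverse) would drop out of the result; LEFT joins from users would keep such rows.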