Why am I getting a Cartesian product with this join?

Time: 2013-12-05 00:05:33

Tags: mysql sql join

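I'm trying to create a query for a report. I have a licenses table and a users table, and I have license_assignments as a many-to-many table for assigning license seats to users: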

mysql> CREATE TABLE license_assignments ( `uid` int(10) unsigned DEFAULT NULL, `lid` int(1) unsigned NOT NULL, `delta` int(10) unsigned NOT NULL, PRIMARY KEY (`lid`, `delta`), KEY `uid` (`uid`));
Query OK, 0 rows affected (0.06 sec)

mysql> INSERT INTO license_assignments VALUES (1, 2, 1), (1,2,2), (1,2,3), (NULL, 2, 4), (NULL, 2, 5), (NULL, 2, 6);
Query OK, 6 rows affected (0.03 sec)
Records: 6  Duplicates: 0  Warnings: 0

mysql> select * FROM license_assignments;
+------+-----+-------+
| uid  | lid | delta |
+------+-----+-------+
| NULL |   2 |     4 |
| NULL |   2 |     5 |
| NULL |   2 |     6 |
|    1 |   2 |     1 |
|    1 |   2 |     2 |
|    1 |   2 |     3 |
+------+-----+-------+
6 rows in set (0.00 sec)

The report I want to create must show the total number of license seats belonging to a particular license...

mysql> select COUNT(lid) FROM license_assignments all_licenses WHERE lid = 2;
+------------+
| COUNT(lid) |
+------------+
|          6 |
+------------+
1 row in set (0.00 sec)

...and how many of those seats are unassigned (have no associated user record):

mysql> select COUNT(lid) FROM license_assignments unassigned_licenses WHERE lid = 2 AND uid IS NULL;
+------------+
| COUNT(lid) |
+------------+
|          3 |
+------------+
1 row in set (0.00 sec)

However, when I put these two queries together with a natural join, I get a Cartesian product (3 x 6 = 18):

mysql> select COUNT(all_licenses.lid) as all_licenses_count, COUNT(unassigned.lid) as unassigned_count FROM license_assignments unassigned, license_assignments all_licenses WHERE unassigned.lid = 2 AND unassigned.uid IS NULL AND all_licenses.lid = 2;
+--------------------+------------------+
| all_licenses_count | unassigned_count |
+--------------------+------------------+
|                 18 |               18 |
+--------------------+------------------+
1 row in set (0.00 sec)

I thought I just needed to add a GROUP BY, so I did, but it didn't help:

mysql> select COUNT(all_licenses.lid) as all_licenses_count, COUNT(unassigned.lid) as unassigned_count FROM license_assignments unassigned, license_assignments all_licenses WHERE unassigned.lid = 2 AND unassigned.uid IS NULL AND all_licenses.lid = 2 GROUP BY all_licenses.lid, unassigned.lid;
+--------------------+------------------+
| all_licenses_count | unassigned_count |
+--------------------+------------------+
|                 18 |               18 |
+--------------------+------------------+
1 row in set (0.00 sec)

Then I thought the natural join was tripping me up, so I tried an inner join:

mysql> select COUNT(all_licenses.lid) as all_licenses_count, COUNT(unassigned.lid) as unassigned_count FROM license_assignments unassigned INNER JOIN license_assignments all_licenses ON all_licenses.lid = unassigned.lid WHERE unassigned.uid IS NULL;
+--------------------+------------------+
| all_licenses_count | unassigned_count |
+--------------------+------------------+
|                 18 |               18 |
+--------------------+------------------+
1 row in set (0.00 sec)

What am I not understanding? I expected to be able to run a single query that gives me this result:

mysql> select COUNT( ... ;
+--------------------+------------------+
| all_licenses_count | unassigned_count |
+--------------------+------------------+
|                  6 |                3 |
+--------------------+------------------+
1 row in set (0.00 sec)

But my set theory is clearly rusty. What do I need to do?

By the way:

mysql> select version();
+-------------------+
| version()         |
+-------------------+
| 5.5.31-1~dotdeb.0 |
+-------------------+

3 answers:

Answer 0 (score: 2)

The query is much simpler than you think :)

SELECT
  COUNT(*) all_licenses_count,
  COUNT(*) - COUNT(uid) unassigned_count
FROM license_assignments
WHERE lid = 2

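COUNT(*) counts all rows, whereas COUNT(uid) counts only the rows where uid is not NULL. The difference between the two is therefore the number of unassigned seats.

Output: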

| ALL_LICENSES_COUNT | UNASSIGNED_COUNT |
|--------------------|------------------|
|                  6 |                3 |

Fiddle here
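
To make the subtraction easier to follow, the three counts can be selected side by side. This is only a sketch against the sample data from the question; the column aliases total_rows, assigned_rows and unassigned_rows are my own and do not appear in the answer.

```sql
-- COUNT(*) counts every row; COUNT(uid) skips rows where uid IS NULL.
SELECT
  COUNT(*)              AS total_rows,       -- 6 with the sample data
  COUNT(uid)            AS assigned_rows,    -- 3 (uid = 1 for delta 1..3)
  COUNT(*) - COUNT(uid) AS unassigned_rows   -- 3 (uid IS NULL for delta 4..6)
FROM license_assignments
WHERE lid = 2;
```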

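Answer 1 (score: 1)

As @Mike Brant mentioned above, you don't need a join just to COUNT(). I'm not sure why you need a join, but if you want one anyway, you are missing a join condition. Here is an example.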

SELECT
  unassigned.lid, unassigned.delta
FROM
  license_assignments unassigned JOIN
  license_assignments all_licenses 
  ON unassigned.lid = all_licenses.lid AND unassigned.delta = all_licenses.delta
WHERE
  unassigned.lid = 2 
  AND unassigned.uid IS NULL 
  AND all_licenses.lid = 2
+-----+-------+
| lid | delta |
+-----+-------+
|   2 |     4 |
|   2 |     5 |
|   2 |     6 |
+-----+-------+
3 rows in set (0.00 sec)

If you look at the queries below, you can see where the problem is.

SELECT uid, lid, delta 
FROM license_assignments all_licenses
WHERE lid = 2;
+------+-----+-------+
| uid  | lid | delta |
+------+-----+-------+
|    1 |   2 |     1 |
|    1 |   2 |     2 |
|    1 |   2 |     3 |
| NULL |   2 |     4 |
| NULL |   2 |     5 |
| NULL |   2 |     6 |
+------+-----+-------+
6 rows in set (0.00 sec)

SELECT uid, lid, delta 
FROM license_assignments all_licenses
WHERE lid = 2 AND uid IS NULL;
+------+-----+-------+
| uid  | lid | delta |
+------+-----+-------+
| NULL |   2 |     4 |
| NULL |   2 |     5 |
| NULL |   2 |     6 |
+------+-----+-------+
3 rows in set (0.00 sec)

Answer 2 (score: 1)

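You're getting a Cartesian product because one set has six rows with lid = 2 and the other set has three rows with lid = 2. Every row in one set matches every row in the other set.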
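
One way to see the duplication is to count distinct seats next to the raw joined rows. This is purely an illustration, not something the question or the other answers propose, and it assumes delta is unique within a single license (which the primary key on (lid, delta) guarantees here). The alias joined_rows is my own.

```sql
-- Each of the 3 unassigned rows pairs with each of the 6 rows for lid = 2.
SELECT COUNT(*)                           AS joined_rows         -- 18 = 6 x 3
     , COUNT(DISTINCT all_licenses.delta) AS all_licenses_count  -- 6
     , COUNT(DISTINCT unassigned.delta)   AS unassigned_count    -- 3
  FROM license_assignments unassigned
  JOIN license_assignments all_licenses
    ON all_licenses.lid = unassigned.lid
 WHERE unassigned.lid = 2
   AND unassigned.uid IS NULL;
```

COUNT(DISTINCT ...) hides the duplication rather than removing it, so the join still produces 18 rows before aggregation.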

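The problem with a JOIN here is that you need to guarantee that a row in the first set matches at most one row in the second set... you need a join predicate on a UNIQUE key.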

If you absolutely need to use a JOIN to get this result set, then this will work:

 SELECT COUNT(a.lid) AS all_licenses_count
      , COUNT(u.lid) AS unassigned_count
   FROM license_assignments a
   LEFT
   JOIN license_assignments u
     ON u.lid = a.lid
    AND u.delta = a.delta
    AND u.uid IS NULL
  WHERE a.lid = 2

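Note that the JOIN predicate matches on (lid, delta), which the table definition declares as the PRIMARY KEY and which is therefore unique. So we are guaranteed that a row in the first set (a) will match at most one row in the second set (u).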
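
To see the row-by-row matching, the same join can be run without the aggregates. This is a sketch against the sample data from the question; the aliases a_delta and u_delta are mine.

```sql
-- Same LEFT JOIN as in the query above, minus the COUNTs.
SELECT a.delta AS a_delta
     , u.delta AS u_delta   -- NULL when seat a is assigned (no match in u)
  FROM license_assignments a
  LEFT
  JOIN license_assignments u
    ON u.lid = a.lid
   AND u.delta = a.delta
   AND u.uid IS NULL
 WHERE a.lid = 2
 ORDER BY a.delta;
-- With the sample data this returns six rows:
-- (1, NULL), (2, NULL), (3, NULL), (4, 4), (5, 5), (6, 6)
```

COUNT(a.lid) sees six non-NULL values, while COUNT(u.lid) sees only the three rows where u matched, which is why the aggregated query returns 6 and 3.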

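As other answers have pointed out, using a JOIN like this is not the most efficient way to get that result.

There are several ways to return an equivalent result, but the most efficient approach is usually to make a single pass over "all" the rows, and then use an expression that performs a conditional test to decide whether each row should be included in another COUNT or SUM aggregate.

I'd write it like this: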

 SELECT SUM(1)             AS all_licenses_count
      , SUM(a.uid IS NULL) AS unassigned_count
   FROM license_assignments a
  WHERE a.lid = 2
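
SUM(a.uid IS NULL) relies on MySQL evaluating a boolean expression to 1 or 0. A more portable spelling of the same single-pass idea uses a CASE expression. The sketch below also groups by lid to report every license at once, which is my own extension and not part of the answer.

```sql
-- One pass over the table; the CASE decides whether a row counts as unassigned.
SELECT a.lid
     , COUNT(*)                                       AS all_licenses_count
     , SUM(CASE WHEN a.uid IS NULL THEN 1 ELSE 0 END) AS unassigned_count
  FROM license_assignments a
 GROUP BY a.lid;
-- With the sample data this returns a single row: (2, 6, 3)
```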