Question

Edit: this looks like an index problem; see the update at the bottom of the question.

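I have the following query plus subquery whose result I cannot explain. I start from this minimal input data set (the application here captures data changes, and the PK is id + tx_id).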

mysql> select * from tag_version;
+----+-------------------+------------+-------+----------------+
| id | name              | article_id | tx_id | operation_type |
+----+-------------------+------------+-------+----------------+
|  1 | some tag          |          1 |     1 |              0 |
|  1 | updated tag       |          1 |     2 |              1 |
|  1 | updated again tag |          1 |     3 |              1 |
|  2 | other tag         |          1 |     2 |              0 |
+----+-------------------+------------+-------+----------------+
4 rows in set (0.00 sec)

The subquery, on its own:

SELECT max(f.tx_id) as max_tx_id, f.id
from tag_version f
WHERE f.tx_id <= 2
GROUP BY f.id

returns:

+-----------+----+
| max_tx_id | id |
+-----------+----+
|         2 |  1 |
|         2 |  2 |
+-----------+----+
2 rows in set (0.00 sec)

The query with the subquery results injected by hand; note that the tuples are equal to the rows above:

select t.*
from tag_version t
where t.article_id = 1
AND (t.tx_id, t.id) IN (
    (2,1),
    (2,2)
)

gives the expected result:

+----+-------------+------------+-------+----------------+
| id | name        | article_id | tx_id | operation_type |
+----+-------------+------------+-------+----------------+
|  1 | updated tag |          1 |     2 |              1 |
|  2 | other tag   |          1 |     2 |              0 |
+----+-------------+------------+-------+----------------+
2 rows in set (0.00 sec)

Finally, using the subquery in place of the tuples...

select t.*
from tag_version t
where t.article_id = 1
AND (t.tx_id, t.id) IN (
    SELECT max(f.tx_id) as tx_id, f.id
    from tag_version f
    WHERE f.tx_id <= 2
    GROUP BY f.id
)

The result is Empty set (0.00 sec)! Can anyone explain this? I get the same empty result when I rewrite the query with EXISTS instead of IN.
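The EXISTS rewrite itself is not shown above. As a rough sketch of one form it could take, the Python snippet below loads the same four rows into an in-memory SQLite database and runs a hypothetical EXISTS variant. It runs on SQLite, not MySQL, so it cannot reproduce the empty result; it only pins down what the query is meant to return. The helper name `latest_rows_exists` and the derived-table alias `m` are assumptions made for illustration.

```python
import sqlite3

# The four rows from the tag_version data set in the question.
ROWS = [
    (1, "some tag", 1, 1, 0),
    (1, "updated tag", 1, 2, 1),
    (1, "updated again tag", 1, 3, 1),
    (2, "other tag", 1, 2, 0),
]

# Hypothetical EXISTS form of the IN query: a row qualifies when the
# per-id maximum tx_id (bounded by the parameter) matches its own
# (tx_id, id) pair.
EXISTS_QUERY = """
SELECT t.id, t.name, t.article_id, t.tx_id, t.operation_type
FROM tag_version t
WHERE t.article_id = 1
  AND EXISTS (
      SELECT 1
      FROM (
          SELECT MAX(f.tx_id) AS max_tx_id, f.id
          FROM tag_version f
          WHERE f.tx_id <= ?
          GROUP BY f.id
      ) AS m
      WHERE m.max_tx_id = t.tx_id
        AND m.id = t.id
  )
ORDER BY t.id
"""


def latest_rows_exists(max_tx_id):
    """Return the newest version of each tag up to max_tx_id (SQLite only)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE tag_version ("
            " id INTEGER NOT NULL,"
            " name TEXT,"
            " article_id INTEGER,"
            " tx_id INTEGER NOT NULL,"
            " operation_type INTEGER NOT NULL,"
            " PRIMARY KEY (id, tx_id))"
        )
        conn.executemany("INSERT INTO tag_version VALUES (?, ?, ?, ?, ?)", ROWS)
        return conn.execute(EXISTS_QUERY, (max_tx_id,)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in latest_rows_exists(2):
        print(row)
```

Binding the tx_id bound as a parameter makes it easy to check other cut-off points against the same data.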

I noticed that when I remove the line WHERE f.tx_id <= 2 from the subquery, I do get rows back (although not the ones I want):

+----+-------------------+------------+-------+----------------+
| id | name              | article_id | tx_id | operation_type |
+----+-------------------+------------+-------+----------------+
|  1 | updated again tag |          1 |     3 |              1 |
|  2 | other tag         |          1 |     2 |              0 |
+----+-------------------+------------+-------+----------------+
2 rows in set (0.00 sec)

Replacing the subquery with a JOIN does return the expected, correct result:

SELECT t.*
FROM tag_version t
JOIN (
    SELECT max(f.tx_id) as max_tx_id, f.id
    from tag_version f
    WHERE f.tx_id <= 2
    GROUP BY f.id
) as max_ids
ON max_ids.max_tx_id = t.tx_id
AND max_ids.id = t.id
where t.article_id = 1

Result:

+----+-------------+------------+-------+----------------+
| id | name        | article_id | tx_id | operation_type |
+----+-------------+------------+-------+----------------+
|  1 | updated tag |          1 |     2 |              1 |
|  2 | other tag   |          1 |     2 |              0 |
+----+-------------+------------+-------+----------------+
2 rows in set (0.00 sec)

Furthermore, running the same query plus subquery on the same data set with PostgreSQL and SQLite gives the expected, correct result.
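For anyone who wants to repeat the SQLite comparison, here is a minimal sketch using Python's built-in sqlite3 module. It assumes the bundled SQLite is 3.15 or newer, since row-value IN needs that, and it translates the MySQL column types to SQLite equivalents. The helper name `latest_rows_in` is made up for this example.

```python
import sqlite3

# Schema mirroring the MySQL table, including both secondary indexes.
SCHEMA = """
CREATE TABLE tag_version (
    id INTEGER NOT NULL,
    name TEXT,
    article_id INTEGER,
    tx_id INTEGER NOT NULL,
    operation_type INTEGER NOT NULL,
    PRIMARY KEY (id, tx_id)
);
CREATE INDEX ix_tag_version_operation_type ON tag_version (operation_type);
CREATE INDEX ix_tag_version_tx_id ON tag_version (tx_id);
"""

ROWS = [
    (1, "some tag", 1, 1, 0),
    (1, "updated tag", 1, 2, 1),
    (1, "updated again tag", 1, 3, 1),
    (2, "other tag", 1, 2, 0),
]

# The query from the question, with the tx_id bound as a parameter and an
# ORDER BY added so the output order is deterministic.
IN_QUERY = """
SELECT t.id, t.name, t.article_id, t.tx_id, t.operation_type
FROM tag_version t
WHERE t.article_id = 1
  AND (t.tx_id, t.id) IN (
      SELECT MAX(f.tx_id) AS tx_id, f.id
      FROM tag_version f
      WHERE f.tx_id <= ?
      GROUP BY f.id
  )
ORDER BY t.id
"""


def latest_rows_in(max_tx_id):
    """Run the IN + subquery form against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SCHEMA)
        conn.executemany("INSERT INTO tag_version VALUES (?, ?, ?, ?, ?)", ROWS)
        return conn.execute(IN_QUERY, (max_tx_id,)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in latest_rows_in(2):
        print(row)
```

With the same indexes in place, SQLite still returns the two expected rows, which is consistent with the problem being specific to how this MySQL version executes the query.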

My MySQL version is Server version: 5.5.40-0ubuntu0.14.04.1 (Ubuntu).

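I think a clue to what is going on is that I do get rows back when I remove the WHERE from the subquery, but I have not been able to draw anything useful from that.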

Edit: updated with the input data set

Edit: added table information

The table's CREATE statement is as follows:

CREATE TABLE `tag_version` (
  `id` int(11) NOT NULL,
  `name` varchar(255) DEFAULT NULL,
  `article_id` int(11) DEFAULT NULL,
  `tx_id` bigint(20) NOT NULL,
  `operation_type` smallint(6) NOT NULL,
  PRIMARY KEY (`id`,`tx_id`),
  KEY `ix_tag_version_operation_type` (`operation_type`),
  KEY `ix_tag_version_tx_id` (`tx_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Data population:

insert into tag_version (id, name, article_id, tx_id, operation_type) VALUES 
(1, 'some tag', 1, 1, 0),
(1, 'updated tag', 1, 2, 1),
(1, 'updated again tag', 1, 3, 1),
(2, 'other tag', 1, 2, 0)
;

When I drop the ix_tag_version_tx_id index, the query returns the correct result... an explanation of why would be helpful.

Answer 1

I believe you made a mistake when showing the result of the first snippet (the subquery).

The output of this query:

SELECT max(f.tx_id) as qwer, f.id
from tag_version f
WHERE f.tx_id <= 2
GROUP BY f.id

is not:

+--------------+----+
| max(f.tx_id) | id |
+--------------+----+
|            2 |  1 |
|            2 |  2 |
+--------------+----+

but:

+------+----+
| qwer | id |
+------+----+
|    2 |  1 |
|    2 |  2 |
+------+----+

(Note the line max(f.tx_id) as qwer.)

Now try this code to get the expected output. The change is in how max(f.tx_id) is selected.

select t.*
from tag_version t
where t.`article_id` = 1
AND t.`operation_type` != 2
AND (t.`tx_id`, t.`id`) IN (
    SELECT max(f.`tx_id`) as `tx_id`, f.`id`
    from tag_version f
    WHERE f.`tx_id` <= 2
    GROUP BY f.`id`
)

Let me know if this gives you the result or any other error.

MySQL riddle with subquery