I have two tables:
1) task - represents a task. It has only a primary key, because all related data lives in the task_version table (task HAS_MANY task_version).
CREATE TABLE task (
    id int(11) unsigned NOT NULL AUTO_INCREMENT,
    PRIMARY KEY (id)
);
Sample data:
INSERT INTO task VALUES ('1');
INSERT INTO task VALUES ('2');
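2) task_version - any change to any task creates a new row in this table. task_id should be a foreign key (omitted for simplicity). This table is the complete history of all changes made to a task.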
CREATE TABLE `task_version` (
    id int(10) unsigned NOT NULL AUTO_INCREMENT,
    task_id int(11) DEFAULT NULL,
    name varchar(255) DEFAULT NULL,
    text varchar(255) DEFAULT NULL,
    status int(11) DEFAULT NULL,
    PRIMARY KEY (id)
);
Sample data:
INSERT INTO `task_version` VALUES ('1', '1', 'Name of task', 'Text of task', '1');
INSERT INTO `task_version` VALUES ('2', '1', 'Name of task', 'Text of task', '1');
INSERT INTO `task_version` VALUES ('3', '1', 'Name of task', 'Text of task', '2');
INSERT INTO `task_version` VALUES ('4', '1', 'Name of task', 'Text of task', '1');
INSERT INTO `task_version` VALUES ('5', '2', 'Name', 'Text', '1');
What I need is the number of status changes for each task.
Obviously, I can't just count the distinct statuses like this:
SELECT
    (
        SELECT
            COUNT(DISTINCT status)
        FROM task_version
        WHERE task_id = t.id
    ) AS distinct_statuses_per_task,
    t.id AS task_id
FROM task t
INNER JOIN task_version tv ON t.id = tv.task_id
GROUP BY t.id
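because distinct_statuses_per_task counts only the distinct values, not the number of changes. If someone changes the status from 1 to 2, from 2 to 1, and then from 1 to 2 again, we get this status sequence: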
1
2
1
2
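So we have 2 distinct statuses (1, 2) but 3 status changes (1 > 2, 2 > 1, 1 > 2), which is why that query doesn't work.
I developed a solution using MySQL user variables. This is the subquery I want to embed in the main query: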
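To make the difference concrete, here is a small Python sketch (not part of the SQL solution; the helper name count_status_changes is made up for illustration) that computes both numbers for the sequence above:

```python
def count_status_changes(statuses):
    """Count positions where a status differs from the one right before it."""
    return sum(1 for prev, cur in zip(statuses, statuses[1:]) if prev != cur)


sequence = [1, 2, 1, 2]                          # the status sequence from the example
distinct_statuses = len(set(sequence))           # what COUNT(DISTINCT status) measures
status_changes = count_status_changes(sequence)  # what is actually wanted

print(distinct_statuses, status_changes)
```

The first number only says which statuses ever occurred; the second depends on the order of the rows, which is why any SQL solution has to compare each row with the previous one.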
SELECT
    CASE WHEN (status != @prev_status AND @prev_status IS NOT NULL)
        THEN @status_changes_quantity := @status_changes_quantity + 1
    END AS incrementing_logic,
    @status_changes_quantity AS status_changes_quantity,
    @prev_status := status AS save_prev
FROM task_version,
    (
        SELECT
            @prev_status := NULL,
            @status_changes_quantity := 0
    ) AS task_version_with_additional_vars
WHERE task_id = 1 -- hardcoded task_id
ORDER BY status_changes_quantity DESC
LIMIT 1
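This works as a standalone query with a hardcoded task_id. But I need to embed it as a subquery to get the number of status changes for every task.
I can't get it to work. The problem is that when I set variables in the SELECT list, they become part of the result set. The subquery should return a single scalar, but mine returns a row with three columns (incrementing_logic, status_changes_quantity, save_prev), and I don't know the syntax to get rid of the unwanted columns (incrementing_logic, save_prev).
I tried this: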
SELECT
    (
        SELECT
            CASE WHEN (status != @prev_status AND @prev_status IS NOT NULL)
                THEN @status_changes_quantity := @status_changes_quantity + 1
            END AS incrementing_logic,
            @status_changes_quantity AS status_changes_quantity,
            @prev_status := status AS save_prev
        FROM task_version,
            (
                SELECT
                    @prev_status := NULL,
                    @status_changes_quantity := 0
            ) AS task_version_with_additional_vars
        WHERE task_id = t.id
        ORDER BY status_changes_quantity DESC
        LIMIT 1
    ) AS status_changes_quantity,
    t.id AS task_id,
    tv.status AS task_status
FROM task t
INNER JOIN task_version tv ON t.id = tv.task_id
And, as expected, I got:
[Err] 1241 - Operand should contain 1 column(s)
Then I tried wrapping the subquery in another temporary (derived) table to get rid of the variable columns and get a scalar value:
SELECT
    (
        SELECT
            status_changes_quantity
        FROM
            (
                SELECT
                    CASE WHEN (status != @prev_status AND @prev_status IS NOT NULL)
                        THEN @status_changes_quantity := @status_changes_quantity + 1
                    END AS incrementing_logic,
                    @status_changes_quantity AS status_changes_quantity,
                    @prev_status := status AS save_prev
                FROM task_version,
                    (
                        SELECT
                            @prev_status := NULL,
                            @status_changes_quantity := 0
                    ) AS task_version_with_additional_vars
                WHERE task_id = t.id
                ORDER BY status_changes_quantity DESC
                LIMIT 1
            ) AS tmp_table
    ) AS status_changes_quantity,
    t.id AS task_id,
    tv.status AS task_status
FROM task t
INNER JOIN task_version tv ON t.id = tv.task_id
Now I get a different error, saying that t.id is no longer visible inside the subquery's scope:
[Err] 1054 - Unknown column 't.id' in 'where clause'
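Maybe someone knows how to solve my problem, either by fixing my query or by suggesting a completely different algorithm.
Thanks in advance.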
Answer 0: (score: 0)
I modified your query slightly:
SELECT task_id, max( status_changes_quantity )
FROM (
    SELECT
        task_id, id,
        CASE WHEN @prev_task_id <> task_id
                THEN @status_changes_quantity := 0
            WHEN status != @prev_status
                THEN @status_changes_quantity := @status_changes_quantity + 1
            ELSE @status_changes_quantity
        END status_changes_quantity,
        @prev_task_id := task_id,
        @prev_status := status
    FROM task_version,
        (
            SELECT
                @prev_status := NULL,
                @prev_task_id := null,
                @status_changes_quantity := 0
        ) AS task_version_with_additional_vars
    -- WHERE task_id = 1
    ORDER BY task_id, id
) q
GROUP BY task_id
ORDER BY 2 DESC
Demo -> http://www.sqlfiddle.com/#!2/c9ecc/14
This query calculates the number of status changes for all task_ids, and also for just one given task if you uncomment the -- WHERE task_id = 1 clause.
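For readers who find the user-variable bookkeeping hard to follow, the same single pass can be sketched in plain Python. This is only an illustration of the logic, not a replacement for the query; the function name status_changes_per_task and the (id, task_id, status) tuple layout are assumptions made for the example, with values taken from the question's sample data.

```python
# (id, task_id, status) rows, mirroring the question's sample data
rows = [(1, 1, 1), (2, 1, 1), (3, 1, 2), (4, 1, 1), (5, 2, 1)]


def status_changes_per_task(rows):
    """Walk the rows ordered by (task_id, id) and count status changes per task."""
    changes = {}
    prev_task_id = None
    prev_status = None
    for _id, task_id, status in sorted(rows, key=lambda r: (r[1], r[0])):
        if task_id != prev_task_id:
            changes[task_id] = 0       # new task: reset the counter
        elif status != prev_status:
            changes[task_id] += 1      # same task, different status: one change
        prev_task_id, prev_status = task_id, status
    return changes


result = status_changes_per_task(rows)
print(result)
```

The sort plays the role of ORDER BY task_id, id: without a fixed row order, "previous status" has no defined meaning.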
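Answer 1: (score: 0)
@kordirko Thanks a lot. Your correction works. Actually, following this article http://www.xaprb.com/blog/2006/12/15/advanced-mysql-user-variable-techniques/ I managed to remove the variable assignments from the result set, which avoids the temporary table.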
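All I had to do (if I understood it correctly) was hide the variable assignments inside the GREATEST function in an additional WHERE condition that always evaluates to TRUE, provided no assigned value is NULL (GREATEST returns NULL if any argument is NULL), like this: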
WHERE task_id = t.id
    AND GREATEST(
        @var1 := if(1 = 1, 'some_value', 'alt_value'), -- conditional logic instead of CASE WHEN
        @var := 123 -- simple assignment
    ) -- this should evaluate to TRUE
So the final version looks like this:
SELECT
    (
        SELECT
            max(@status_changes_quantity) AS status_changes_quantity
        FROM task_version,
            (
                SELECT
                    @prev_status := NULL,
                    @status_changes_quantity := 0,
                    @prev_task_id := 0
            ) AS task_version_with_additional_vars
        WHERE GREATEST(
                @status_changes_quantity := if(task_id != @prev_task_id, 0, @status_changes_quantity),
                @prev_task_id := task_id,
                @status_changes_quantity := if((status != @prev_status AND @prev_status IS NOT NULL), @status_changes_quantity + 1, @status_changes_quantity),
                @prev_status := status
            )
            AND task_id = t.id
    ) AS status_changes_quantity,
    t.id AS task_id
FROM task t
INNER JOIN task_version tv ON t.id = tv.task_id
GROUP BY t.id
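Since the question also asked for a completely different algorithm, here is a hedged sketch of one that uses no user variables at all: for each row, a correlated subquery looks up the status of the previous version of the same task and the row is counted only if the two differ. The motivation is that MySQL documents the evaluation order of expressions involving user variables as undefined, so the variable-based queries depend on behaviour that is not guaranteed. The sketch runs against SQLite through Python's sqlite3 module purely as a stand-in so that it is self-contained; the SQL uses only standard constructs and should carry over to MySQL, but that is an assumption and it has not been tested there. The schema is reduced to the columns the query needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE task (id INTEGER PRIMARY KEY);
    CREATE TABLE task_version (
        id INTEGER PRIMARY KEY,
        task_id INTEGER,
        status INTEGER
    );
    INSERT INTO task VALUES (1), (2);
    INSERT INTO task_version (id, task_id, status) VALUES
        (1, 1, 1), (2, 1, 1), (3, 1, 2), (4, 1, 1), (5, 2, 1);
""")

# Count rows whose status differs from the previous version of the same task.
# For a task's first version the inner subquery yields NULL, the comparison is
# NULL rather than true, and the row is not counted.
QUERY = """
SELECT t.id AS task_id,
       (SELECT COUNT(*)
          FROM task_version cur
         WHERE cur.task_id = t.id
           AND cur.status <> (SELECT prev.status
                                FROM task_version prev
                               WHERE prev.task_id = cur.task_id
                                 AND prev.id < cur.id
                               ORDER BY prev.id DESC
                               LIMIT 1)
       ) AS status_changes_quantity
  FROM task t
 ORDER BY t.id
"""

result = conn.execute(QUERY).fetchall()
print(result)
```

The trade-off is one extra lookup per version row, so on a large task_version table an index on (task_id, id) would probably be needed for this to perform acceptably.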