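I have a table [visit]. For each user_id, I need a running sum of visit_duration_seconds over the rows where order_number is null. For example, for user [2875636] I would get: 61 + 151 + 33 + 13. Each row should include the sum of the rows before it. Please also refer to the RESULT column in the expected result below.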
user_id  starttime            visit_duration_seconds  order_number
2875636  2013-01-16 18:03:50  61                      NULL
2875636  2013-01-16 18:08:18  151                     NULL
2875636  2013-01-16 18:15:43  33                      NULL
2875636  2013-01-16 18:16:37  13                      NULL
2875636  2013-01-16 18:18:01  2011                    10177888
2875636  2013-01-16 18:24:35  1172                    10177884
2875636  2013-01-16 18:32:03  4731                    NULL
2875636  2013-01-16 18:33:27  407                     NULL
2875636  2013-01-16 18:37:29  74                      NULL
2875636  2013-01-16 18:48:55  80                      NULL
2875636  2013-01-16 19:05:00  1955                    NULL
2875636  2013-01-16 19:14:12  326                     NULL
2875636  2013-01-16 19:23:39  972                     NULL
2875636  2013-01-16 19:33:05  5440                    NULL
2875636  2013-01-16 19:35:48  43                      NULL
2875636  2013-01-16 19:41:10  66                      NULL
2875636  2013-01-16 19:42:03  100                     NULL
2875636  2013-01-16 19:42:12  2414                    10177940
2875636  2013-01-16 19:49:05  432                     10177925
2875636  2013-01-16 19:50:19  183                     NULL
2875636  2013-01-16 19:52:46  2061                    NULL
2875636  2013-01-16 19:52:53  400                     NULL
2875636  2013-01-16 20:00:47  338                     NULL
2875636  2013-01-16 20:08:58  216                     NULL
2875636  2013-01-16 20:14:21  58                      NULL
2875636  2013-01-16 20:14:26  196                     NULL
2875636  2013-01-16 20:19:14  2189                    NULL
2875636  2013-01-16 20:21:29  424                     NULL
2875636  2013-01-16 20:24:42  999                     NULL
2875636  2013-01-16 21:01:39  1810                    NULL
2875636  2013-01-16 21:02:54  525                     NULL
2875636  2013-01-16 21:10:06  27                      NULL
2875636  2013-01-16 21:12:08  282                     NULL
2875636  2013-01-16 21:51:02  6                       NULL
2875636  2013-01-16 22:18:34  173                     NULL
2875636  2013-01-16 23:02:58  318                     NULL
2875636  2013-01-16 23:45:37  207                     NULL
3018868  2013-01-16 16:01:45  18                      NULL
3018868  2013-01-16 16:16:45  39                      NULL
3018868  2013-01-16 16:22:55  656                     NULL
3018868  2013-01-16 16:25:54  1852                    NULL
3018868  2013-01-16 16:29:23  688                     NULL
3018868  2013-01-16 16:47:26  2258                    10177846
3018868  2013-01-16 16:57:41  572                     NULL
3018868  2013-01-16 17:06:47  1431                    NULL
3018868  2013-01-16 17:18:32  29                      NULL
3018868  2013-01-16 17:21:57  45                      NULL
3018868  2013-01-16 17:29:23  16                      NULL
3018868  2013-01-16 17:36:47  490                     NULL
Expected result
user_id  starttime            visit_duration_seconds  order_number  RESULT
2875636  2013-01-16 18:03:50  61                      NULL          61
2875636  2013-01-16 18:08:18  151                     NULL          212
2875636  2013-01-16 18:15:43  33                      NULL          245
2875636  2013-01-16 18:16:37  13                      NULL          258
2875636  2013-01-16 18:18:01  2011                    10177888      0
2875636  2013-01-16 18:24:35  1172                    10177884      0
2875636  2013-01-16 18:32:03  4731                    NULL          4731
2875636  2013-01-16 18:33:27  407                     NULL          5138
2875636  2013-01-16 18:37:29  74                      NULL          5212
2875636  2013-01-16 18:48:55  80                      NULL          ...
2875636  2013-01-16 19:05:00  1955                    NULL          ...
2875636  2013-01-16 19:14:12  326                     NULL          ...
2875636  2013-01-16 19:23:39  972                     NULL
2875636  2013-01-16 19:33:05  5440                    NULL
2875636  2013-01-16 19:35:48  43                      NULL
2875636  2013-01-16 19:41:10  66                      NULL
2875636  2013-01-16 19:42:03  100                     NULL
2875636  2013-01-16 19:42:12  2414                    10177940
2875636  2013-01-16 19:49:05  432                     10177925
2875636  2013-01-16 19:50:19  183                     NULL
2875636  2013-01-16 19:52:46  2061                    NULL
2875636  2013-01-16 19:52:53  400                     NULL
2875636  2013-01-16 20:00:47  338                     NULL
2875636  2013-01-16 20:08:58  216                     NULL
2875636  2013-01-16 20:14:21  58                      NULL
2875636  2013-01-16 20:14:26  196                     NULL
2875636  2013-01-16 20:19:14  2189                    NULL
2875636  2013-01-16 20:21:29  424                     NULL
2875636  2013-01-16 20:24:42  999                     NULL
2875636  2013-01-16 21:01:39  1810                    NULL
2875636  2013-01-16 21:02:54  525                     NULL
2875636  2013-01-16 21:10:06  27                      NULL
2875636  2013-01-16 21:12:08  282                     NULL
2875636  2013-01-16 21:51:02  6                       NULL
2875636  2013-01-16 22:18:34  173                     NULL
2875636  2013-01-16 23:02:58  318                     NULL
2875636  2013-01-16 23:45:37  207                     NULL
3018868  2013-01-16 16:01:45  18                      NULL
3018868  2013-01-16 16:16:45  39                      NULL
3018868  2013-01-16 16:22:55  656                     NULL
3018868  2013-01-16 16:25:54  1852                    NULL
3018868  2013-01-16 16:29:23  688                     NULL
3018868  2013-01-16 16:47:26  2258                    10177846
3018868  2013-01-16 16:57:41  572                     NULL
3018868  2013-01-16 17:06:47  1431                    NULL
3018868  2013-01-16 17:18:32  29                      NULL
3018868  2013-01-16 17:21:57  45                      NULL
3018868  2013-01-16 17:29:23  16                      NULL
3018868  2013-01-16 17:36:47  490                     NULL
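Answer 0: (score: 2)
You can use MySQL user variables to emulate analytic functions. (There are a few other approaches, such as a semi-join or a correlated subquery. I can provide solutions using those as well, if you think they might be more suitable.)
To emulate a "running total" analytic function, try something like this: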
SELECT t.user_id
     , t.starttime
     , t.order_number
       -- reset the running total on an order row, otherwise accumulate
     , IF(t.order_number IS NOT NULL,
          @tot_dur := 0,
          @tot_dur := @tot_dur + t.visit_duration_seconds) AS tot_dur
  FROM visit t
       -- inline view d initializes @tot_dur to zero
  JOIN (SELECT @tot_dur := 0) d
 ORDER BY t.user_id, t.starttime
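The "trick" here is to use the IF function to test whether order_number is null. When it is null, we add the duration value to the variable; otherwise, we set the variable to zero.
We use an inline view (aliased as d) to ensure the @tot_dur variable is initialized to zero.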
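Note: be careful with MySQL user variables used like this. In the SELECT statement above, the variable assignments in the SELECT list are observed to happen after the ORDER BY is applied, so we get predictable results. The MySQL Reference Manual, however, says the order of evaluation for expressions involving user variables is undefined, so treat this as observed behavior rather than a guarantee.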
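That query does not handle the "break" in user_id. For that, we need the value of user_id from the previous row, which we can keep in another user variable. The expressions in the SELECT list are evaluated in the order they appear (in practice), so we need to take care to do the accumulation before we overwrite the previous row's user_id.
We need to reorder the columns so that user_id is returned after tot_dur (or include a second copy of the user_id column):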
SELECT t.user_id
     , t.starttime
     , t.order_number
       -- accumulate only while we stay on the same user; restart on a new user
     , IF(t.order_number IS NULL,
          @tot_dur := IF(@prev_user_id = t.user_id,@tot_dur,0) + t.visit_duration_seconds,
          @tot_dur := 0
       ) AS tot_dur
       -- remember this row's user_id AFTER tot_dur has been computed
     , @prev_user_id := t.user_id AS prev_user_id
  FROM visit t
  JOIN (SELECT @tot_dur := 0, @prev_user_id := NULL) d
 ORDER BY t.user_id, t.starttime
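The values returned in the user_id and prev_user_id columns are identical. The "extra" column can be removed, or the columns can be reordered by wrapping the query (as an inline view) in another query, though this comes at a performance cost: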
SELECT v.user_id
     , v.starttime
     , v.order_number
     , v.tot_dur
  FROM (SELECT t.starttime
             , t.order_number
             , IF(t.order_number IS NULL,
                  @tot_dur := IF(@prev_user_id = t.user_id,@tot_dur,0) + t.visit_duration_seconds,
                  @tot_dur := 0
               ) AS tot_dur
               -- assigned last, so tot_dur still sees the previous row's user_id
             , @prev_user_id := t.user_id AS user_id
          FROM visit t
          JOIN (SELECT @tot_dur := 0, @prev_user_id := NULL) d
         ORDER BY t.user_id, t.starttime
       ) v
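That query shows that MySQL can return the specified result set. For best performance, though, we would run just the query in the inline view (aliased as v) and handle the column reordering (putting the user_id column first) on the client, as the rows are retrieved.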
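As an aside, if you are on MySQL 8.0 or later (an assumption on my part; the user-variable queries above also run on older versions), real window functions can produce the same running total without user variables. This is only a sketch, not tested against your data. It assumes starttime is unique per user, and the names grp and result are made up for illustration. COUNT(order_number) counts the order rows seen so far for the user, so its value changes at every order row and can serve as a group key for the running SUM:

```sql
SELECT g.user_id
     , g.starttime
     , g.visit_duration_seconds
     , g.order_number
       -- order rows report 0; other rows report the running total in their group
     , IF(g.order_number IS NOT NULL,
          0,
          SUM(IF(g.order_number IS NULL, g.visit_duration_seconds, 0))
            OVER (PARTITION BY g.user_id, g.grp
                  ORDER BY g.starttime
                  ROWS UNBOUNDED PRECEDING)
       ) AS result
  FROM (SELECT t.*
               -- grp (hypothetical alias): number of order rows so far for this user
             , COUNT(t.order_number)
                 OVER (PARTITION BY t.user_id
                       ORDER BY t.starttime
                       ROWS UNBOUNDED PRECEDING) AS grp
          FROM visit t
       ) g
 ORDER BY g.user_id, g.starttime
```

Each order row starts a new group, and the inner IF keeps that row's own duration out of the sum, so the total restarts with the first non-order row that follows.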
The other two common approaches are a semi-join and a correlated subquery, though these can be more resource-intensive when processing large sets.
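For illustration, a correlated subquery version could look roughly like the following. Take it as a sketch under assumptions rather than a tested solution: it assumes starttime is unique per user, and the result column name is my own. For each non-order row t, it sums the durations of that user's rows up to t, skipping any row s that has an order row between s and t (inclusive):

```sql
SELECT t.user_id
     , t.starttime
     , t.visit_duration_seconds
     , t.order_number
     , IF(t.order_number IS NOT NULL,
          0,
          ( SELECT SUM(s.visit_duration_seconds)
              FROM visit s
             WHERE s.user_id = t.user_id
               AND s.starttime <= t.starttime
               -- keep s only if no order row lies between s and t (inclusive)
               AND NOT EXISTS
                   ( SELECT 1
                       FROM visit o
                      WHERE o.user_id = t.user_id
                        AND o.order_number IS NOT NULL
                        AND o.starttime >= s.starttime
                        AND o.starttime <= t.starttime
                   )
          )
       ) AS result
  FROM visit t
 ORDER BY t.user_id, t.starttime
```

The subquery runs once per row of visit, which is why this approach tends to get expensive on large sets; an index on (user_id, starttime) would likely help, though that is a guess about your schema.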