我正在寻找在商店数据集中的交易(连续的?),这种趋势遵循的趋势是尽管一天之内有一些先前的取消,但他们最终还是完成了交易。
有效的批处理交易必须符合一组条件。
X
的取消数量,但1
完成。查询应保留所有符合上述条件的批次。最终表应在列batch
中包含批次的总运行量,以区分它们。
初始表
shop amount status date
------------------------------
A 1234 Cancelled 20101010
A 1234 Cancelled 20101010
A 1234 Completed 20101010
A 1234 Cancelled 20101010
A 1234 Completed 20101011
A 1000 Completed 20101011
B 100 Cancelled 20101011
B 100 Cancelled 20101011
B 4321 Cancelled 20101011
B 4321 Cancelled 20101011
C 333 Cancelled 20101012
C 333 Completed 20101012
C 333 Completed 20101012
D 111 Cancelled 20101013
D 155 Cancelled 20101013
D 111 Completed 20101013
D 155 Completed 20101013
按天划分:
shop amount status date
------------------------------
A 1234 Cancelled 20101010
A 1234 Cancelled 20101010
A 1234 Completed 20101010
A 1234 Cancelled 20101010
------------------------------
A 1234 Completed 20101011
A 1000 Completed 20101011
B 100 Cancelled 20101011
B 100 Cancelled 20101011
B 4321 Cancelled 20101011
B 4321 Cancelled 20101011
------------------------------
C 333 Cancelled 20101012
C 333 Completed 20101012
C 333 Completed 20101012
------------------------------
D 111 Cancelled 20101013
D 155 Cancelled 20101013
D 111 Completed 20101013
D 155 Completed 20101013
结果表:
shop amount status date batch
-------------------------------------
A 1234 Cancelled 20101010 1
A 1234 Cancelled 20101010 1
A 1234 Completed 20101010 1
-------------------------------------
A 1234 Completed 20101011 2
A 1000 Completed 20101011 3
-------------------------------------
C 333 Cancelled 20101012 4
C 333 Completed 20101012 4
C 333 Completed 20101012 5
-------------------------------------
D 111 Cancelled 20101013 6
D 155 Cancelled 20101013 7
D 111 Completed 20101013 6
D 155 Completed 20101013 7
表格查询:
([] shop:`A`A`A`A`A`A`B`B`B`B`C`C`C`D`D`D`D; amount: 1234 1234 1234 1234 1234 1000 100 100 4321 4321 333 333 333 111 155 111 155; status:`Cancelled`Cancelled`Completed`Cancelled`Completed`Completed`Cancelled`Cancelled`Cancelled`Cancelled`Cancelled`Completed`Completed`Cancelled`Cancelled`Completed`Completed; date: `20101010`20101010`20101010`20101010`20101011`20101011`20101011`20101011`20101011`20101011`20101012`20101012`20101012`20101013`20101013`20101013`20101013)
说明:
第一天, A 进行了4笔交易。前三个批次具有相同的数量, [已取消->已取消->已完成] 是相同的。最后一天的交易会在一天结束时被忽略。
第二天, A 进行一笔金额为1234
的交易,但不将前一天的交易作为其批次的一部分。一个完成1000
的另一笔交易。 B 进行了四笔交易,但由于被 a)取消或b)十次幂而无法跟踪。
第三天, C 进行三笔相同金额的交易。这是两个批处理,因为第一个取消和完成构成了初始批处理,而最终完成的交易本身就是一个批处理。
在第四天, D 进行了四笔交易并分两批进行。请注意,此处的交易不是连续的,因为有两个金额不同的已取消交易,但都在将来完成。
表格按时间戳和日期排序,即23:59:59到00:00:00。查询不需要是单行查询,可以是写入任何临时表/变量等的多行查询。
此外,如果有一种方法可以获取每批取消的交易数,这将很有帮助。
答案 0 :(得分:6)
因此,首先计算完成的批次数量。
q)n:count select from tab where status=`Completed
然后使用以下查询将批号分配给每个“已完成”行
q)btab:update batch:1+til n from tab where status=`Completed
q)btab
shop amount status date batch
------------------------------------
A 1234 Cancelled 20101010
A 1234 Cancelled 20101010
A 1234 Completed 20101010 1
A 1234 Cancelled 20101010
A 1234 Completed 20101011 2
A 1000 Completed 20101011 3
B 100 Cancelled 20101011
B 100 Cancelled 20101011
B 4321 Cancelled 20101011
B 4321 Cancelled 20101011
C 333 Cancelled 20101012
C 333 Completed 20101012 4
C 333 Completed 20101012 5
D 111 Cancelled 20101013
D 155 Cancelled 20101013
D 111 Completed 20101013 6
D 155 Completed 20101013 7
然后反转表格以按日期,商店和金额填充空值,然后反转并删除10的幂的所有取消(使用与特里林奇相同的逻辑)
q)ftab:reverse update fills batch by date,shop,amount from reverse btab where not (status=`Cancelled)&{x=`int$x}10 xlog amount
q)ftab
shop amount status date batch
------------------------------------
A 1234 Cancelled 20101010 1
A 1234 Cancelled 20101010 1
A 1234 Completed 20101010 1
A 1234 Cancelled 20101010
A 1234 Completed 20101011 2
A 1000 Completed 20101011 3
B 100 Cancelled 20101011
B 100 Cancelled 20101011
B 4321 Cancelled 20101011
B 4321 Cancelled 20101011
C 333 Cancelled 20101012 4
C 333 Completed 20101012 4
C 333 Completed 20101012 5
D 111 Cancelled 20101013 6
D 155 Cancelled 20101013 7
D 111 Completed 20101013 6
D 155 Completed 20101013 7
然后从表中选择并提取具有批号的数据
q)stab:select from ftab where batch<>0N
q)stab
shop amount status date batch
------------------------------------
A 1234 Cancelled 20101010 1
A 1234 Cancelled 20101010 1
A 1234 Completed 20101010 1
A 1234 Completed 20101011 2
A 1000 Completed 20101011 3
C 333 Cancelled 20101012 4
C 333 Completed 20101012 4
C 333 Completed 20101012 5
D 111 Cancelled 20101013 6
D 155 Cancelled 20101013 7
D 111 Completed 20101013 6
D 155 Completed 20101013 7
q)
最后这是一个查询,以获取每批次的取消数量
q)select numberOfCancellations:-1+count batch by batch from stab
batch| numberOfCancellations
-----| ---------------------
1 | 2
2 | 0
3 | 0
4 | 1
5 | 0
6 | 1
7 | 1
答案 1 :(得分:1)
这不是最终查询,至少是一个起点:
q)select from tab where not (status=`Cancelled)&{x=`int$x}10 xlog amount, ({raze(reverse maxs reverse@)each`Completed=x[`status] group x`amount};([]amount;status)) fby ([]date;shop)
shop amount status date
------------------------------
A 1234 Cancelled 20101010
A 1234 Cancelled 20101010
A 1234 Completed 20101010
A 1234 Completed 20101011
A 1000 Completed 20101011
C 333 Cancelled 20101012
C 333 Completed 20101012
C 333 Completed 20101012
D 111 Cancelled 20101013
D 155 Cancelled 20101013
D 111 Completed 20101013
D 155 Completed 20101013
批处理逻辑可以通过后续查询完成