KDB: Selecting data "around" certain events

Time: 2014-06-24 14:24:24

Tags: windows kdb

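Consider a huge market data table T. I am particularly interested in the rows where Status=`SSS.

However, in addition to the rows given by (select from T where Status=`SSS), I also want to select the 10 records that occur before and after each of those rows. (Note that in some cases these intervals may overlap.) What is an efficient way to do this?

Note that I tried something like the following, which nearly crashed my port and used up all the memory.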

select from
update diff:min each abs i-(count i)# enlist (exec distinct x from select from
(update x:i from T) where Status=`SSS),where diff<10 

3 answers:

Answer 0 (score: 1)

Here is another solution, only slightly modified from WooiKent's answer, but it improves on both time and space.

select from t where i in distinct raze (-10+til 21)+\:(exec i from t where sym=`CC)

For WooiKent's sample table:

\ts select from t where i in distinct raze (-10+til 21)+\:(exec i from t where sym=`CC)
113 77595968j
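
As a small illustration of what the index arithmetic produces, here is a sketch on a made-up 12-row table, using a window of 2 rows instead of 10. The table `toy` and the names `idx`, `w` and `r` are assumptions for this example and are not part of the answer.

```q
/ toy table (hypothetical data): rows 2 and 9 carry the symbol of interest
toy:([]n:til 12;sym:`AA`AA`CC`AA`AA`AA`AA`AA`AA`CC`AA`AA)
/ row numbers of the matching rows
idx:exec i from toy where sym=`CC
/ offsets -2..2 added to every matching row number, then flattened and de-duplicated
w:distinct raze (-2+til 5)+\:idx
/ keep the rows whose row number falls in any of the windows
r:select from toy where i in w
```

Offsets that fall outside the table (negative, or past the last row) need no special handling here, because `i in w` simply never matches them.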

Answer 1 (score: 0)

q)n:1000000
q)t:update `g#sym from`time xasc([]time:n?.z.t;sym:n?`AA`BB`CC;side:n?`buy`sell;price:10+n?1.0;size:1000*n?10)
q)t
time         sym side price    size
-----------------------------------
00:00:00.014 BB  sell 10.40464 7000
00:00:00.052 AA  sell 10.42747 1000
00:00:00.063 BB  buy  10.9406  7000
00:00:00.085 AA  sell 10.23984 7000
00:00:00.105 CC  buy  10.06752 7000
00:00:00.127 AA  sell 10.83174 1000
00:00:00.141 AA  sell 10.29591 8000
00:00:00.167 BB  sell 10.75681 2000
00:00:00.232 CC  buy  10.56052 1000
00:00:00.234 AA  sell 10.16642 7000
00:00:00.281 BB  buy  10.58453 7000
00:00:00.284 BB  buy  10.08245 2000
00:00:00.338 AA  sell 10.4551  1000
00:00:00.455 BB  buy  10.13024 8000
00:00:00.463 CC  sell 10.43779 5000
00:00:00.477 CC  buy  10.5226  0
00:00:00.535 CC  sell 10.59109 7000
00:00:00.671 AA  sell 10.90785 4000
00:00:00.702 CC  sell 10.60891 9000
00:00:00.704 BB  buy  10.30173 8000
..
q){select from t where i in raze(til[1+2*x]-x)+/:where sym in `CC}1
time         sym side price    size
-----------------------------------
00:00:00.085 AA  sell 10.23984 7000
00:00:00.105 CC  buy  10.06752 7000
00:00:00.127 AA  sell 10.83174 1000
00:00:00.167 BB  sell 10.75681 2000
00:00:00.232 CC  buy  10.56052 1000
00:00:00.234 AA  sell 10.16642 7000
00:00:00.455 BB  buy  10.13024 8000
00:00:00.463 CC  sell 10.43779 5000
00:00:00.477 CC  buy  10.5226  0
00:00:00.535 CC  sell 10.59109 7000
00:00:00.671 AA  sell 10.90785 4000
00:00:00.702 CC  sell 10.60891 9000
00:00:00.704 BB  buy  10.30173 8000
00:00:00.716 CC  buy  10.00173 5000
00:00:00.753 BB  sell 10.04301 4000
00:00:01.188 BB  sell 10.86634 8000
00:00:01.210 CC  buy  10.0534  3000
00:00:01.231 BB  buy  10.28736 3000
00:00:01.725 AA  sell 10.25753 5000
00:00:01.783 CC  buy  10.38823 6000
q)\ts {select from t where i in raze(til[1+2*x]-x)+/:where sym in `CC}1
96 37749856
q)\ts {select from t where i in raze(til[1+2*x]-x)+/:where sym in `CC}10
195 154503136
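
The two timings above suggest that memory use grows with the window size, since every matching row contributes `1+2*x` indices before `in` is applied. One alternative, offered only as a sketch and not benchmarked here, is to build a boolean mask with a moving maximum instead of an index list. The names `toy`, `x`, `b`, `m` and `r` are assumptions for this example.

```q
/ toy table (hypothetical data): rows 2 and 9 carry the symbol of interest
toy:([]n:til 12;sym:`AA`AA`CC`AA`AA`AA`AA`AA`AA`CC`AA`AA)
x:2                                   / rows to keep on each side
b:toy[`sym]=`CC                       / flag the rows of interest
/ mmax looks back over the last 1+2*x items, so pad x zeros on the right
/ and drop x items from the left to centre the window on each row
m:x _ 0<mmax[1+2*x;`long$b,x#0b]
r:toy where m
```

The mask has one entry per row regardless of the window size, which is why it might use less memory for wide windows; whether it is actually faster on the 1,000,000-row table would need to be measured with `\ts`.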

Answer 2 (score: 0)

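This is an attempt to solve the problem for records within -7 and +7 seconds of a trade (I am assuming you count the trade time as 0, so together these make up 15 seconds, but you can change that accordingly).

Since you are looking at ranges, and these may overlap, it makes no sense to check a whole series of ranges; check instead the merged ranges that contain everything you need. So we create a helper function that merges time ranges. Note that it assumes a) your start point is < your end point in each range, and b) you feed the ranges in sorted by ascending start point. This can definitely be improved.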

//broken up across lines for Stack Overflow formatting / ease of reading
mergeranges:{(enlist first fw){
   (neg[d]_x),enlist @[y;0;:;(first y;first p) d:first[y] within p:last x]
   }/1_fw:flip x}

For example, if we have the following ranges

[0, 2]
      [3, 4]
          [4, 6]

we really only need to check [0 2] and [3 6]

q)mergeranges (0 3 4; 2 4 6)
0 2
3 6
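
One possible improvement, sketched under the same two assumptions: when a range lies entirely inside the previous merged range (for example [0, 10] followed by [3, 4]), the fold in `mergeranges` appears to replace the larger end point with the smaller one. Tracking a running maximum of the end points avoids that. `mergeall` is a hypothetical name and is not part of the answer.

```q
/ mergeall (hypothetical helper): x is (starts;ends), sorted by ascending start
/ s: start points; e: running maximum of the end points seen so far
/ new: 1b where a start lies beyond every earlier end, i.e. a new merged range begins
/ result: one (start;end) pair per merged range, the same shape mergeranges returns
mergeall:{s:x 0;e:maxs x 1;new:1b,(-1_e)<1_s;flip(s where new;e where 1_new,1b)}

mergeall (0 3 4;2 4 6)     / same input as the example above
mergeall (0 3 4;10 4 6)    / second and third ranges sit inside the first
```

For the first call this should give the same two ranges as before, [0 2] and [3 6]; for the second it should give the single range [0 10]. Because the result has the same shape, it can be used with `within/:` in the same way as `mergeranges`.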

q)\S 1
q)t:([]time:til 100;status:100?10; px:100?1.)
q)-7 7+\:exec time from t where status=0
10 27 32 35 36 43 68 70
24 41 46 49 50 57 82 84
q)mergeranges -7 7+\:exec time from t where status=0
10 24
27 57
68 84
q)5#select from t where any time within/:mergeranges -7 7+\:exec time from t where status=0
time status px
----------------------
10   1      0.1634704
11   5      0.7766767
12   6      0.8928093
13   6      0.6203577
14   3      0.07747125