kdb+ equivalent of SQL's rank() and dense_rank()

Time: 2019-04-10 20:13:55

Tags: kdb

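Has anyone had to simulate the results of SQL's rank(), dense_rank() and row_number() in kdb+? Here is some SQL to demonstrate the functions. If anyone has a specific solution for the case below, perhaps I can work on generalising it to support multiple partition and order-by columns, and post it back on this site.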

CREATE TABLE student(course VARCHAR(10), mark int, name varchar(10));

INSERT INTO student VALUES  
('Maths', 60, 'Thulile'),
('Maths', 60, 'Pritha'),
('Maths', 70, 'Voitto'),
('Maths', 55, 'Chun'),
('Biology', 60, 'Bilal'),
('Biology', 70, 'Roger');

SELECT
 RANK() OVER (PARTITION BY course ORDER BY mark DESC) AS rank,
 DENSE_RANK() OVER (PARTITION BY course ORDER BY mark DESC) AS dense_rank,
 ROW_NUMBER() OVER (PARTITION BY course ORDER BY mark DESC) AS row_num,
 course, mark, name 
FROM student ORDER BY course, mark DESC;

+------+------------+---------+---------+------+---------+
| rank | dense_rank | row_num | course  | mark | name    |
+------+------------+---------+---------+------+---------+
|    1 |          1 |       1 | Biology |   70 | Roger   |
|    2 |          2 |       2 | Biology |   60 | Bilal   |
|    1 |          1 |       1 | Maths   |   70 | Voitto  |
|    2 |          2 |       2 | Maths   |   60 | Thulile |
|    2 |          2 |       3 | Maths   |   60 | Pritha  |
|    4 |          3 |       4 | Maths   |   55 | Chun    |
+------+------------+---------+---------+------+---------+

Here is some kdb+ to generate the equivalent student table:

student:([] course:`Maths`Maths`Maths`Maths`Biology`Biology; 
   mark:60 60 70 55 60 70; 
   name:`Thulile`Pritha`Voitto`Chun`Bilal`Roger)

Thanks!

3 answers:

Answer 0 (score: 3)

If the table is first sorted by course (ascending) and mark (descending):

student:`course xasc `mark xdesc ([] course:`Maths`Maths`Maths`Maths`Biology`Biology;mark:60 60 70 55 60 70;name:`Thulile`Pritha`Voitto`Chun`Bilal`Roger)
course  mark name
--------------------
Biology 70   Roger
Biology 60   Bilal
Maths   70   Voitto
Maths   60   Thulile
Maths   60   Pritha
Maths   55   Chun

Then you can use something like the following to produce the output:

update rank_sql:first row_num by course,mark from update dense_rank:1+where count each (where differ mark)cut mark,row_num:1+rank i by course from  student

course  mark name    dense_rank row_num rank_sql
------------------------------------------------
Biology 70   Roger   1          1       1
Biology 60   Bilal   2          2       2
Maths   70   Voitto  1          1       1
Maths   60   Thulile 2          2       2
Maths   60   Pritha  2          3       2
Maths   55   Chun    3          4       4

If you want to read further, see the documentation on rank and on the virtual index column i.
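The dense_rank expression in the update above packs several steps into one line. As a rough illustration, here it is unrolled on the marks of a single course that is already sorted descending. The intermediate names `starts`, `runs`, `sizes` and `dense` are made up for this sketch and do not appear in the answer.

```q
/ marks for one course, already sorted descending
mark:70 60 60 55

starts:where differ mark   / 0 1 3: index where each run of equal marks begins
runs:starts cut mark       / three runs: 70, then 60 60, then 55
sizes:count each runs      / 1 2 1
dense:1+where sizes        / 1 2 2 3: each run's number, repeated once per member
```

`where` applied to an integer vector repeats each index by the count stored at that index, which is what turns the run sizes back into one dense rank per row.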

Answer 1 (score: 1)

For a table that is already sorted on the target column:

q) dense_sql:{sums differ x}
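q) / b holds the 1-based start of each run; deltas against the sentinel 1+count x give every run's size, including a tied last run
q) rank_sql:{raze #'[1_deltas b,1+count x;b:1+where differ x]}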
q) row_sql:{1+til count x}

q) student:`course xasc `mark xdesc ([] course:`Maths`Maths`Maths`Maths`Biology`Biology;mark:60 60 70 55 60 70;name:`Thulile`Pritha`Voitto`Chun`Bilal`Roger)

q)update row_num:row_sql mark,rank_s:rank_sql mark,dense_s:dense_sql mark by course from student
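On a vector that is already sorted, the SQL-style rank of an item is one more than the index at which its value first occurs, and `x?x` returns exactly those first-occurrence indices. A minimal sketch of that alternative follows; the name `rankFirst` is hypothetical and is not part of the answer above.

```q
/ hypothetical helper: x must already be sorted on the ranking column
rankFirst:{1+x?x}

rankFirst 70 60 60 55   / 1 2 2 4
rankFirst 70 60 60      / 1 2 2: a tie at the end of the vector is handled too
```

Like the functions in this answer, it would be applied per partition, for example `update rank_s:rankFirst mark by course from student` on the sorted table.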

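Answer 2 (score: 1)

Here is what I have come up with so far. Note: the rank function in kdb ranks in ascending order, so I created the functions below. I would not xdesc the table, since I can simply desc the column vector instead.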

q)denseF
{((desc distinct x)?x)+1}
q)rankF
{((desc x)?x)+1}

q)update dense_rank:denseF mark,rank_rank:rankF mark,row_num:1+rank i by course from student

course  mark name    dense_rank rank_rank row_num
-------------------------------------------------
Maths   60   Thulile 2          2         1
Maths   60   Pritha  2          2         2
Maths   70   Voitto  1          1         3
Maths   55   Chun    3          4         4
Biology 60   Bilal   2          2         1
Biology 70   Roger   1          1         2
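Because the table is not sorted here, `1+rank i` numbers the rows in their original order within each course, so row_num differs from the SQL ROW_NUMBER() output (Voitto gets 3 rather than 1). One possible way to get the SQL numbering without sorting the table, sketched under the assumption that ties should keep their original row order, is to invert the descending grade. The name `rowF` is made up for this example.

```q
student:([] course:`Maths`Maths`Maths`Maths`Biology`Biology;
  mark:60 60 70 55 60 70;
  name:`Thulile`Pritha`Voitto`Chun`Bilal`Roger)

/ idesc x lists the indices in descending order of x (ties keep original order);
/ iasc of that permutation gives each item's 0-based position in the sorted order
rowF:{1+iasc idesc x}

r:update row_num:rowF mark by course from student
r`row_num   / 2 3 1 4 2 1
```

With this, Voitto gets 1, Thulile 2, Pritha 3 and Chun 4 within Maths, matching the SQL result in the question.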