I have a table like this:
+-------+--------------+--------------+-------------+-------------+-------------+-------------+
| Study | Point_number | Date_created | condition_A | condition_B | condition_C | condition_D |
+-------+--------------+--------------+-------------+-------------+-------------+-------------+
| 1     | 1            | 01-01-2001   | 1           | 1           | 0           | 1           |
| 1     | 2            | 01-01-2001   | 0           | 1           | 1           | 0           |
| 1     | 3            | 01-01-2001   | 0           | 1           | 0           | 0           |
+-------+--------------+--------------+-------------+-------------+-------------+-------------+
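condition_A, condition_B, condition_C and condition_D are used to classify the data points into groups, so each unique combination of these columns is one group. For each group, I want to retrieve the last 200 rows.
At the moment I have something like this: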
select * from my_table where point_number <= 200;
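To do this for every group, I could write something like this:
select * from my_table where point_number <= 200 and condition_A = 1 and condition_B = 1 and condition_C = 1 and condition_D = 1
union all
select * from my_table where point_number <= 200 and condition_A = 1 and condition_B = 1 and condition_C = 1 and condition_D = 0
union all...;
The problem with this approach is that there are many combinations, and I would like the query to be as flexible as possible. How can I avoid the UNION ALL and have the query automatically retrieve 200 rows for each group?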
Answer 0 (score: 3)
Your original query:
select *
from my_table
where point_number <= 200;
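should do what you describe: it retrieves the rows whose point_number is 200 or less, and it does so for every group.
If you want 200 rows in each group, then something like this may be what you really want: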
select t.*
from (select t.*,
             row_number() over (partition by condition_A, condition_B, condition_C, condition_D
                                order by point_number desc) as seqnum
      from my_table t
     ) t
where seqnum <= 200;
This assumes that point_number is increasing and that larger values are "more recent". You may want to use date_created rather than point_number in the order by.
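To try the idea out without touching a real database, here is a minimal sketch that runs the same kind of query from Python's built-in sqlite3 module. Several things here are assumptions and not part of the answer: SQLite (3.25 or later, for window functions) stands in for the actual database, the rows are made up, dates are stored as ISO text so they sort chronologically, N is 2 where the question uses 200, and point_number is added as a tie-breaker for rows that share a date.

```python
import sqlite3

# In-memory SQLite stands in for the real database (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("""
    create table my_table (
        study integer, point_number integer, date_created text,
        condition_A integer, condition_B integer,
        condition_C integer, condition_D integer
    )
""")

# Made-up rows forming two groups: (1,1,0,1) and (0,1,0,0).
rows = [
    (1, 1, "2001-01-01", 1, 1, 0, 1),
    (1, 2, "2001-01-02", 1, 1, 0, 1),
    (1, 3, "2001-01-03", 1, 1, 0, 1),
    (1, 4, "2001-01-01", 0, 1, 0, 0),
    (1, 5, "2001-01-02", 0, 1, 0, 0),
]
conn.executemany("insert into my_table values (?, ?, ?, ?, ?, ?, ?)", rows)

N = 2  # the question asks for 200

# Number the rows of each group from most recent to oldest, then keep the first N.
LAST_N_PER_GROUP = """
    select point_number, date_created,
           condition_A, condition_B, condition_C, condition_D
    from (select t.*,
                 row_number() over (
                     partition by condition_A, condition_B, condition_C, condition_D
                     order by date_created desc, point_number desc
                 ) as seqnum
          from my_table t
         ) t
    where seqnum <= ?
    order by condition_A, condition_B, condition_C, condition_D, seqnum
"""
last_rows = conn.execute(LAST_N_PER_GROUP, (N,)).fetchall()
for row in last_rows:
    print(row)
```

Because the groups come from the partition by clause, new combinations of the condition columns are picked up without changing the query.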
Answer 1 (score: 1)
How about this:
SELECT *
FROM my_table
WHERE point_number <= 200
AND condition_A = 1
AND condition_B = 1
AND condition_C = 1
AND condition_D IN (0, 1);
Answer 2 (score: 1)
This may help you work out what you need to do:
with sample_data as (select 1 id, 1 ca, 0 cb from dual union all
                     select 2 id, 1 ca, 1 cb from dual union all
                     select 3 id, 1 ca, 1 cb from dual union all
                     select 4 id, 0 ca, 0 cb from dual union all
                     select 5 id, 0 ca, 1 cb from dual union all
                     select 6 id, 0 ca, 1 cb from dual union all
                     select 7 id, 0 ca, 0 cb from dual union all
                     select 8 id, 1 ca, 0 cb from dual union all
                     select 9 id, 1 ca, 1 cb from dual union all
                     select 10 id, 0 ca, 1 cb from dual union all
                     select 11 id, 0 ca, 0 cb from dual union all
                     select 12 id, 1 ca, 0 cb from dual union all
                     select 13 id, 1 ca, 0 cb from dual union all
                     select 14 id, 0 ca, 1 cb from dual union all
                     select 15 id, 0 ca, 0 cb from dual union all
                     select 16 id, 1 ca, 1 cb from dual union all
                     select 17 id, 0 ca, 0 cb from dual)
select id,
       ca,
       cb,
       row_number() over (partition by ca, cb order by id) rn
from sample_data;
        ID         CA         CB         RN
---------- ---------- ---------- ----------
         4          0          0          1
         7          0          0          2
        11          0          0          3
        15          0          0          4
        17          0          0          5
         5          0          1          1
         6          0          1          2
        10          0          1          3
        14          0          1          4
         1          1          0          1
         8          1          0          2
        12          1          0          3
        13          1          0          4
         2          1          1          1
         3          1          1          2
         9          1          1          3
        16          1          1          4
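Basically, you need to work out the row number of each row within its group. That is a job for an analytic function, specifically the row_number() analytic function.
If you haven't come across analytic functions before, they are similar to aggregate functions in that they compute results across groups (the "partition by" clause), but they do so without collapsing the rows. If you're not familiar with them yet, I'd recommend reading up on them!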
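To make the "without collapsing the rows" point concrete, here is a small sketch that runs the same count once as an aggregate and once as an analytic function. It is an illustration under assumptions: it uses Python's built-in sqlite3 module with SQLite 3.25 or later in place of Oracle, and only four made-up rows in a table named sample_data.

```python
import sqlite3

# In-memory SQLite stands in for Oracle here (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("create table sample_data (id integer, ca integer, cb integer)")
conn.executemany(
    "insert into sample_data values (?, ?, ?)",
    [(1, 1, 0), (2, 1, 1), (3, 1, 1), (4, 0, 0)],
)

# Aggregate: one output row per (ca, cb) group, the individual rows are gone.
aggregated = conn.execute("""
    select ca, cb, count(*) as cnt
    from sample_data
    group by ca, cb
    order by ca, cb
""").fetchall()

# Analytic: every input row is kept, and each one carries its group's count.
analytic = conn.execute("""
    select id, ca, cb, count(*) over (partition by ca, cb) as cnt
    from sample_data
    order by id
""").fetchall()

print(aggregated)  # three rows, one per group
print(analytic)    # four rows, one per input row
```

The aggregate returns three rows for four inputs, while the analytic version returns all four, which is what lets you filter individual rows by their position in the group afterwards.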
Anyway, once you've assigned the row numbers, you can wrap an outer query around the SQL to filter on the row number, e.g.:
with sample_data as (select 1 id, 1 ca, 0 cb from dual union all
                     select 2 id, 1 ca, 1 cb from dual union all
                     select 3 id, 1 ca, 1 cb from dual union all
                     select 4 id, 0 ca, 0 cb from dual union all
                     select 5 id, 0 ca, 1 cb from dual union all
                     select 6 id, 0 ca, 1 cb from dual union all
                     select 7 id, 0 ca, 0 cb from dual union all
                     select 8 id, 1 ca, 0 cb from dual union all
                     select 9 id, 1 ca, 1 cb from dual union all
                     select 10 id, 0 ca, 1 cb from dual union all
                     select 11 id, 0 ca, 0 cb from dual union all
                     select 12 id, 1 ca, 0 cb from dual union all
                     select 13 id, 1 ca, 0 cb from dual union all
                     select 14 id, 0 ca, 1 cb from dual union all
                     select 15 id, 0 ca, 0 cb from dual union all
                     select 16 id, 1 ca, 1 cb from dual union all
                     select 17 id, 0 ca, 0 cb from dual),
     results as (select id,
                        ca,
                        cb,
                        row_number() over (partition by ca, cb order by id) rn
                 from sample_data)
select *
from results
where rn <= 3;
        ID         CA         CB         RN
---------- ---------- ---------- ----------
         4          0          0          1
         7          0          0          2
        11          0          0          3
         5          0          1          1
         6          0          1          2
        10          0          1          3
         1          1          0          1
         8          1          0          2
        12          1          0          3
         2          1          1          1
         3          1          1          2
         9          1          1          3
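Note that the query above numbers rows by ascending id, so rn <= 3 keeps the three lowest ids in each group. If "last" means the highest ids, one option is to reverse the ordering inside the window. The sketch below does that with the same 17 sample rows. It is an assumption-laden illustration: it runs through Python's built-in sqlite3 module (SQLite 3.25 or later) in place of Oracle, and a real table stands in for the "from dual" CTE.

```python
import sqlite3

# In-memory SQLite stands in for Oracle here (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("create table sample_data (id integer, ca integer, cb integer)")

# Same 17 (id, ca, cb) rows as the sample_data CTE above.
sample = [
    (1, 1, 0), (2, 1, 1), (3, 1, 1), (4, 0, 0), (5, 0, 1), (6, 0, 1),
    (7, 0, 0), (8, 1, 0), (9, 1, 1), (10, 0, 1), (11, 0, 0), (12, 1, 0),
    (13, 1, 0), (14, 0, 1), (15, 0, 0), (16, 1, 1), (17, 0, 0),
]
conn.executemany("insert into sample_data values (?, ?, ?)", sample)

# "order by id desc" gives rn = 1 to the highest id in each group,
# so rn <= 3 keeps the last three rows per group.
last_three = conn.execute("""
    with results as (select id,
                            ca,
                            cb,
                            row_number() over (partition by ca, cb order by id desc) rn
                     from sample_data)
    select id, ca, cb, rn
    from results
    where rn <= 3
    order by ca, cb, rn
""").fetchall()

for row in last_three:
    print(row)
```

For the question's table, the same change would mean ordering by point_number or date_created in descending order and filtering on rn <= 200.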