Question

I have a table like this:

+-------+--------------+--------------+-------------+-------------+-------------+-------------+
| Study | Point_number | Date_created | condition_A | condition_B | condition_C | condition_D |
+-------+--------------+--------------+-------------+-------------+-------------+-------------+
|     1 |            1 | 01-01-2001   |           1 |           1 |           0 |           1 |
|     1 |            2 | 01-01-2001   |           0 |           1 |           1 |           0 |
|     1 |            3 | 01-01-2001   |           0 |           1 |           0 |           0 |
+-------+--------------+--------------+-------------+-------------+-------------+-------------+

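condition_A, B, C and D are used to classify the data points into groups, so each unique combination of these columns is one group. For each group, I want to retrieve the last 200 rows.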

Currently I have something like this:

select * from my_table where point_number <= 200;

To do this for each group, I could do something like this:

select * from my_table where point_number <= 200 and condition_A = 1 and condition_B = 1 and condition_C = 1 and condition_D = 1
union all
select * from my_table where point_number <= 200 and condition_A = 1 and condition_B = 1 and condition_C = 1 and condition_D = 0
union all...;

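The problem with this approach is that there are many combinations, and I would like to keep the query as flexible as possible. How can I avoid the UNION ALL and have the query automatically retrieve 200 rows for each group?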

Answer 1

Your original query:

select *
from my_table
where point_number <= 200;

should do what you describe: it retrieves the rows whose point_number is at most 200, and it should do this for every group.

If you want 200 rows from each group, then something like this may be what you really want:

select t.*
from (select t.*,
             row_number() over (partition by condition_A, condition_B, condition_C, condition_D
                                order by point_number desc) as seqnum
      from my_table t
     ) t
where seqnum <= 200;

This assumes that point_number is increasing, so that larger values are "more recent". You may want to use date_created rather than point_number in the order by.
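If date_created is the better indicator of recency, the ordering can be switched to it. The following is only a sketch under some assumptions: the column names are taken from the question's table (with condition_D spelled with an underscore), date_created is assumed to be a real DATE column rather than a string, and the point_number tie-breaker is an illustrative addition for rows that share the same date.

```sql
-- Sketch: keep the 200 most recent rows per group, judged by date_created.
-- Assumes date_created is a DATE column; the point_number tie-breaker is an
-- illustrative addition so that rows with equal dates are ranked consistently.
select t.*
from (select t.*,
             row_number() over (partition by condition_A, condition_B, condition_C, condition_D
                                order by date_created desc, point_number desc) as seqnum
      from my_table t
     ) t
where seqnum <= 200;
```

Without a tie-breaker, rows with identical dates could be numbered in any order, so which of them fall inside the 200 would not be predictable.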

Answer 2

How about this:

SELECT  * 
FROM    my_table 
WHERE   point_number <= 200 
AND     condition_A = 1 
AND     condition_B = 1 
AND     condition_C = 1 
AND     condition_D IN (0, 1);

Answer 3

This may help you work out what you need to do:

with sample_data as (select 1 id, 1 ca, 0 cb from dual union all
                     select 2 id, 1 ca, 1 cb from dual union all
                     select 3 id, 1 ca, 1 cb from dual union all
                     select 4 id, 0 ca, 0 cb from dual union all
                     select 5 id, 0 ca, 1 cb from dual union all
                     select 6 id, 0 ca, 1 cb from dual union all
                     select 7 id, 0 ca, 0 cb from dual union all
                     select 8 id, 1 ca, 0 cb from dual union all
                     select 9 id, 1 ca, 1 cb from dual union all
                     select 10 id, 0 ca, 1 cb from dual union all
                     select 11 id, 0 ca, 0 cb from dual union all
                     select 12 id, 1 ca, 0 cb from dual union all
                     select 13 id, 1 ca, 0 cb from dual union all
                     select 14 id, 0 ca, 1 cb from dual union all
                     select 15 id, 0 ca, 0 cb from dual union all
                     select 16 id, 1 ca, 1 cb from dual union all
                     select 17 id, 0 ca, 0 cb from dual)
select id,
       ca,
       cb,
       row_number() over (partition by ca, cb order by id) rn
from   sample_data;

        ID         CA         CB         RN
---------- ---------- ---------- ----------
         4          0          0          1
         7          0          0          2
        11          0          0          3
        15          0          0          4
        17          0          0          5
         5          0          1          1
         6          0          1          2
        10          0          1          3
        14          0          1          4
         1          1          0          1
         8          1          0          2
        12          1          0          3
        13          1          0          4
         2          1          1          1
         3          1          1          2
         9          1          1          3
        16          1          1          4

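Basically, you need to work out the row number of each row within its group. That is a job for analytic functions, specifically the row_number() analytic function.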

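If you haven't come across analytic functions before, they are basically similar to aggregate functions (in that you can compute results across groups, which is the "partition by" part) but they do not collapse the rows. If you're not familiar with them yet, I recommend doing some research on them!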
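To make the difference concrete, here is a small side-by-side sketch. The three-row sample_data set is made up for illustration and follows the same Oracle "from dual" style as the examples in this answer.

```sql
-- Aggregate: rows are collapsed, one result row per value of ca.
with sample_data as (select 1 id, 1 ca from dual union all
                     select 2 id, 1 ca from dual union all
                     select 3 id, 0 ca from dual)
select ca,
       count(*) cnt
from   sample_data
group by ca;

-- Analytic: every input row is kept, each carrying the count of its group.
with sample_data as (select 1 id, 1 ca from dual union all
                     select 2 id, 1 ca from dual union all
                     select 3 id, 0 ca from dual)
select id,
       ca,
       count(*) over (partition by ca) cnt
from   sample_data;
```

The first statement returns two rows (one per group), while the second returns all three rows, with cnt = 2 on both ca = 1 rows and cnt = 1 on the ca = 0 row.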

Anyway, once you have assigned your row numbers, you can wrap an outer query around the SQL to filter on the row number, e.g.:

with sample_data as (select 1 id, 1 ca, 0 cb from dual union all
                     select 2 id, 1 ca, 1 cb from dual union all
                     select 3 id, 1 ca, 1 cb from dual union all
                     select 4 id, 0 ca, 0 cb from dual union all
                     select 5 id, 0 ca, 1 cb from dual union all
                     select 6 id, 0 ca, 1 cb from dual union all
                     select 7 id, 0 ca, 0 cb from dual union all
                     select 8 id, 1 ca, 0 cb from dual union all
                     select 9 id, 1 ca, 1 cb from dual union all
                     select 10 id, 0 ca, 1 cb from dual union all
                     select 11 id, 0 ca, 0 cb from dual union all
                     select 12 id, 1 ca, 0 cb from dual union all
                     select 13 id, 1 ca, 0 cb from dual union all
                     select 14 id, 0 ca, 1 cb from dual union all
                     select 15 id, 0 ca, 0 cb from dual union all
                     select 16 id, 1 ca, 1 cb from dual union all
                     select 17 id, 0 ca, 0 cb from dual),
         results as (select id,
                            ca,
                            cb,
                            row_number() over (partition by ca, cb order by id) rn
                     from   sample_data)
select *
from   results
where  rn <= 3;

        ID         CA         CB         RN
---------- ---------- ---------- ----------
         4          0          0          1
         7          0          0          2
        11          0          0          3
         5          0          1          1
         6          0          1          2
        10          0          1          3
         1          1          0          1
         8          1          0          2
        12          1          0          3
         2          1          1          1
         3          1          1          2
         9          1          1          3
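Note that ordering by id ascending keeps the first rows of each group. Since the question asks for the last rows, the ordering inside row_number() would need to be descending. A minimal sketch of that variation, using a shortened, made-up sample set and a limit of 2 instead of 200:

```sql
with sample_data as (select 1 id, 1 ca, 0 cb from dual union all
                     select 2 id, 1 ca, 1 cb from dual union all
                     select 3 id, 1 ca, 1 cb from dual union all
                     select 4 id, 1 ca, 0 cb from dual union all
                     select 5 id, 1 ca, 1 cb from dual union all
                     select 6 id, 1 ca, 0 cb from dual),
         results as (select id,
                            ca,
                            cb,
                            -- desc: rn = 1 is the highest id in each (ca, cb) group
                            row_number() over (partition by ca, cb order by id desc) rn
                     from   sample_data)
select id,
       ca,
       cb
from   results
where  rn <= 2
order by ca, cb, id;
```

This keeps ids 4 and 6 for the group (ca = 1, cb = 0) and ids 3 and 5 for the group (ca = 1, cb = 1), dropping the oldest row of each group.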

Is there a way to avoid the union in this Oracle query?
