SQL - Combining incomplete records

Time: 2011-12-08 15:19:06

Tags: sql oracle

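I am using Oracle 10g. I have a table with many fields of varying types. The fields hold observations made by a particular site, on a particular date, about a particular thing.

So:

ItemID, Date, Observation1, Observation2, Observation3...

There are about 40 observations in each record. The table structure cannot be changed at this point.

Unfortunately, not all of the observations have been populated (either by accident or because the site was unable to make that recording). I need to combine all the records for a particular item into a single record in a query, making it as complete as possible.

A simple way to do this would be something like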

SELECT
    ItemID,
    MAX(Date),
    MAX(Observation1),
    MAX(Observation2)
    etc.
FROM
    Table
GROUP BY
    ItemID

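Ideally, though, I would like to pick the most recent observation available rather than the max/min value. I could do this by writing subqueries of the form
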
SELECT
    ItemID,
    ObservationX,
    ROW_NUMBER() OVER (PARTITION BY ItemID ORDER BY Date DESC) ROWNUMBER
FROM
    Table
WHERE
    ObservationX IS NOT NULL

and joining all the ROWNUMBER = 1 rows together per ItemID, but with this many fields that would take 40 subqueries.
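
For two observations, that join would look roughly like the sketch below. The names obs_table and obs_date are hypothetical stand-ins for the real table and date column, and the pattern would have to be repeated once per observation column:

```sql
-- Sketch only: obs_table and obs_date are placeholder names, not the real schema.
SELECT i.ItemID,
       o1.Observation1,
       o2.Observation2
FROM (SELECT DISTINCT ItemID FROM obs_table) i
LEFT JOIN (
    -- latest non-NULL Observation1 per item gets rn = 1
    SELECT ItemID,
           Observation1,
           ROW_NUMBER() OVER (PARTITION BY ItemID ORDER BY obs_date DESC) rn
    FROM obs_table
    WHERE Observation1 IS NOT NULL
) o1 ON o1.ItemID = i.ItemID AND o1.rn = 1
LEFT JOIN (
    -- latest non-NULL Observation2 per item gets rn = 1
    SELECT ItemID,
           Observation2,
           ROW_NUMBER() OVER (PARTITION BY ItemID ORDER BY obs_date DESC) rn
    FROM obs_table
    WHERE Observation2 IS NOT NULL
) o2 ON o2.ItemID = i.ItemID AND o2.rn = 1;
```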

My question is whether there is a more concise way to do this.

2 answers:

Answer 0: (score: 3)

Create the table and sample data

SQL> create table observation(
       item_id number,
       dt      date,
       val1    number,
       val2    number );

Table created.

SQL> insert into observation values( 1, date '2011-12-01', 1, null );

1 row created.

SQL> insert into observation values( 1, date '2011-12-02', null, 2 );

1 row created.

SQL> insert into observation values( 1, date '2011-12-03', 3, null );

1 row created.

SQL> insert into observation values( 2, date '2011-12-01', 4, null );

1 row created.

SQL> insert into observation values( 2, date '2011-12-02', 5, 6 );

1 row created.

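Then use the KEEP clause on the MAX aggregate function, with an ORDER BY that sorts rows whose observation is NULL ahead of every real observation, so that DENSE_RANK LAST picks the most recent non-NULL value. The placeholder date used in the ORDER BY must be earlier than the earliest real observation in the table.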

SQL> select item_id,
            max(val1) keep( dense_rank last
                                 order by (case when val1 is not null
                                                then dt
                                                else date '1900-01-01'
                                             end) ) val1,
            max(val2) keep( dense_rank last
                                 order by (case when val2 is not null
                                                then dt
                                                else date '1900-01-01'
                                             end) ) val2
       from observation
      group by item_id;

   ITEM_ID       VAL1       VAL2
---------- ---------- ----------
         1          3          2
         2          5          6

I suspect there is a more elegant way to ignore the NULL values than adding a CASE expression to the ORDER BY, but the CASE gets the job done.
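
One candidate is the analytic LAST_VALUE function with IGNORE NULLS, which skips NULLs directly and so needs no placeholder date. The following is a minimal sketch against the same observation table, assuming the 10g release in use accepts IGNORE NULLS. For the sample rows above it should return the same two rows as the KEEP version:

```sql
-- Sketch: the explicit window makes every row of an item see the whole
-- partition, so each row carries the latest non-NULL value; DISTINCT then
-- collapses the identical rows to one per item.
select distinct
       item_id,
       last_value(val1 ignore nulls)
         over (partition by item_id
               order by dt
               rows between unbounded preceding and unbounded following) val1,
       last_value(val2 ignore nulls)
         over (partition by item_id
               order by dt
               rows between unbounded preceding and unbounded following) val2
  from observation;
```

With 40 columns the window definition is repeated 40 times, so it is no shorter than the KEEP form. The gain is only that the NULL handling is stated explicitly.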

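Answer 1: (score: 0)

I don't know the Oracle commands, but in SQL you can do something like this.

First, use a pivot table containing the consecutive numbers 0, 1, 2, ...

I'm not sure, but I believe the Oracle counterpart of the "isnull" function is "NVL".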

  select itemgroup.ItemId,
    max(case when itemsimplified.i = 0 then itemsimplified.observation1 end) as observation1,
    max(case when itemsimplified.i = 1 then itemsimplified.observation2 end) as observation2,
    max(case when itemsimplified.i = 2 then itemsimplified.observation3 end) as observation3,
    ...
    max(case when itemsimplified.i = 39 then itemsimplified.observation40 end) as observation40
  from (
    select items.ItemId
    from table items
    where items.ItemId = _parameter_to_retrieve_only_one_item /* filter here to select one or more items */
    group by items.ItemId) itemgroup
    left join 
    (
      select 
       items.ItemId, 
       p.i,
       /* each pivot index i picks out one observation column */
       isnull(max(case when p.i = 0 then itemcombinations.observation1 end), '') as observation1,
       isnull(max(case when p.i = 1 then itemcombinations.observation2 end), '') as observation2,
       isnull(max(case when p.i = 2 then itemcombinations.observation3 end), '') as observation3,
       ...
       isnull(max(case when p.i = 39 then itemcombinations.observation40 end), '') as observation40
      from 
       (select i from pivot where i < 40 /* your number of observation columns, one index per column */
       ) p
       cross join table items

       left join table itemcombinations
       on items.ItemId = itemcombinations.ItemId

       where items.ItemId = _parameter_to_retrieve_only_one_item /* filter here to select one or more items */
        and (   (p.i = 0 and itemcombinations.observation1 is not null) /* column 1 */
             or (p.i = 1 and itemcombinations.observation2 is not null) /* column 2 */
             or (p.i = 2 and itemcombinations.observation3 is not null) /* column 3 */
             .... 
             or (p.i = 39 and itemcombinations.observation40 is not null)) /* column 40 */
       group by p.i, items.ItemId
     ) itemsimplified
     on itemsimplified.ItemId = itemgroup.ItemId

  group by itemgroup.ItemId

About the pivot table

Create a pivot table like this:

Pivot table schema

name:pivot columns:{i:datatype int}

How to populate it

Create a foo table

foo schema

name: foo column: value datatype varchar

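insert into foo values ('0');
insert into foo values ('1');
insert into foo values ('2');
insert into foo values ('3');
insert into foo values ('4');
insert into foo values ('5');
insert into foo values ('6');
insert into foo values ('7');
insert into foo values ('8');
insert into foo values ('9');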

/* insert 100 values (00 .. 99) */
insert into pivot
select concat(a.value, b.value)      /* MySQL */
    /* a.value + b.value                SQL Server */
    /* a.value || b.value               Oracle */
from foo a, foo b;

/* or insert 1000 values (000 .. 999) */
insert into pivot
select concat(a.value, b.value, c.value)   /* MySQL */
    /* a.value + b.value + c.value            SQL Server */
    /* a.value || b.value || c.value          Oracle */
from foo a, foo b, foo c;

The pivot table idea comes from "Transact-SQL Cookbook" by Jonathan Gennick and Ales Spetic.
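
On Oracle specifically, the pivot table can probably be filled without the helper foo table by running a hierarchical query against dual. This is a sketch only, assuming the pivot table has the single int column i described above:

```sql
-- Oracle-only sketch: LEVEL counts 1, 2, 3, ... while CONNECT BY keeps
-- generating rows, so LEVEL - 1 yields 0 .. 99.
insert into pivot (i)
select level - 1
  from dual
connect by level <= 100;
```

Changing the upper bound in the CONNECT BY condition changes how many numbers are generated.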

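I must admit that the solution above (by Justin Cave) is simpler and easier to understand, but this is another workable option.

In the end, as you say, you solved it.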