An aggregate function that allows only one unique input

Time: 2010-12-11 11:27:15

Tags: sql mysql sql-server oracle postgresql

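I often find myself adding expressions to the group by clause that I am sure are unique. Sometimes it turns out I am wrong, because of a mistake in my SQL or a mistaken assumption, and the expression is not really unique.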

In many cases I would much rather this raised an SQL error than silently expanded my result set, sometimes very subtly.

I would love to be able to do something like this:

select product_id, unique description from product group by product_id

Obviously I cannot add that syntax myself, but on some databases a user-defined aggregate can achieve something almost as concise.

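Would a special aggregate that allows only one unique input value be generally useful in all flavours of SQL? If so, can such a thing be implemented today on most databases? null values should be treated just like any other value, unlike the way the built-in aggregate avg typically works. (I have added answers with ways of implementing this for postgres and Oracle.)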

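The following example is intended to show how the aggregate would be used, but it is a simple case where it is obvious which expressions should be unique. Real usage would more likely be in larger queries, where it is easier to make mistaken assumptions about uniqueness.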

Tables:

 product_id | description
------------+-------------
          1 | anvil
          2 | brick
          3 | clay
          4 | door

 sale_id | product_id |  cost
---------+------------+---------
       1 |          1 | £100.00
       2 |          1 | £101.00
       3 |          1 | £102.00
       4 |          2 |   £3.00
       5 |          2 |   £3.00
       6 |          2 |   £3.00
       7 |          3 |  £24.00
       8 |          3 |  £25.00

Queries:

> select * from product join sale using (product_id);

 product_id | description | sale_id |  cost
------------+-------------+---------+---------
          1 | anvil       |       1 | £100.00
          1 | anvil       |       2 | £101.00
          1 | anvil       |       3 | £102.00
          2 | brick       |       4 |   £3.00
          2 | brick       |       5 |   £3.00
          2 | brick       |       6 |   £3.00
          3 | clay        |       7 |  £24.00
          3 | clay        |       8 |  £25.00

> select product_id, description, sum(cost) 
  from product join sale using (product_id) 
  group by product_id, description;

 product_id | description |   sum
------------+-------------+---------
          2 | brick       |   £9.00
          1 | anvil       | £303.00
          3 | clay        |  £49.00

> select product_id, solo(description), sum(cost) 
  from product join sale using (product_id) 
  group by product_id;

 product_id | solo  |   sum
------------+-------+---------
          1 | anvil | £303.00
          3 | clay  |  £49.00
          2 | brick |   £9.00

Error case:

> select solo(description) from product;
ERROR:  This aggregate only allows one unique input
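
For readers who want to try the behaviour without a database server, here is a minimal sketch of the same semantics using Python's built-in sqlite3 module. SQLite is not one of the databases asked about; it is used only because it runs in-process. The class name Solo and the _UNSET sentinel are illustrative assumptions, and costs are stored as plain integers instead of a money type.

```python
import sqlite3

_UNSET = object()  # sentinel: distinguishes "no rows yet" from a NULL input


class Solo:
    """Aggregate that returns its input only if every row in the group agrees."""

    def __init__(self):
        self.value = _UNSET

    def step(self, value):
        if self.value is _UNSET:
            self.value = value
        elif self.value != value:
            # SQL NULL arrives as None, and None == None in Python,
            # so NULL is compared like any other value.
            raise ValueError("This aggregate only allows one unique input")

    def finalize(self):
        return None if self.value is _UNSET else self.value


conn = sqlite3.connect(":memory:")
conn.create_aggregate("solo", 1, Solo)
conn.executescript("""
    create table product(product_id integer primary key, description text);
    insert into product values (1, 'anvil'), (2, 'brick'), (3, 'clay'), (4, 'door');
    create table sale(sale_id integer primary key, product_id integer, cost integer);
    insert into sale(product_id, cost) values
        (1, 100), (1, 101), (1, 102), (2, 3), (2, 3), (2, 3), (3, 24), (3, 25);
""")

# One description per product_id, so the aggregate passes it through.
rows = conn.execute("""
    select product_id, solo(description), sum(cost)
    from product join sale using (product_id)
    group by product_id
    order by product_id
""").fetchall()
print(rows)

# Four different descriptions in a single group, so the query must fail.
try:
    conn.execute("select solo(description) from product").fetchall()
    failed = False
except sqlite3.Error:
    failed = True
print(failed)
```

The sentinel matters: starting from None would make a group whose first row is NULL look like a group with no rows yet, which is exactly the NULL handling the question wants to avoid.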

4 Answers:

Answer 0 (score: 7)

Oracle solution

select product_id, 
       case when min(description) != max(description) then to_char(1/0) 
            else min(description) end description, 
       sum(cost) 
  from product join sale using (product_id) 
  group by product_id;

Instead of to_char(1/0) (which raises a DIVIDE_BY_ZERO error), you can use a simple function:

CREATE OR REPLACE FUNCTION solo (i_min IN VARCHAR2, i_max IN VARCHAR2) 
RETURN VARCHAR2 IS
BEGIN
  IF i_min != i_max THEN
    RAISE_APPLICATION_ERROR(-20001, 'Non-unique value specified');
  ELSE
    RETURN i_min;
  END IF;
END;
/
select product_id, 
       solo(min(description), max(description)) description, 
       sum(cost) 
from product join sale using (product_id) 
group by product_id;

You could use a user-defined aggregate, but I would worry about the performance impact of switching between SQL and PL/SQL.
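
One caveat about the min/max comparison: MIN and MAX skip NULLs, so a group that mixes a value with NULL still has min equal to max, and the check passes even though the question asks for NULL to count as a value. The sketch below illustrates this and one possible stricter test that also compares count(val) with count(*). It runs on SQLite through Python's sqlite3 module purely for convenience; the table t and its columns are made up for illustration, and in Oracle the boolean expressions would have to be wrapped in CASE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table t(grp integer, val text);
    insert into t values
        (1, 'a'), (1, 'a'),      -- one unique value
        (2, 'a'), (2, null),     -- a value mixed with NULL
        (3, 'a'), (3, 'b'),      -- two different values
        (4, null), (4, null);    -- only NULLs
""")

# minmax_ok: the plain min/max test.
# strict_ok: additionally requires that either no row or every row is non-NULL.
rows = conn.execute("""
    select grp,
           min(val) = max(val) as minmax_ok,
           count(val) = 0
             or (min(val) = max(val) and count(val) = count(*)) as strict_ok
    from t
    group by grp
    order by grp
""").fetchall()
print(rows)
```

Group 2 is the interesting one: the plain test accepts it, the stricter one rejects it. Group 4 shows the opposite edge: with only NULLs the plain comparison is itself NULL, while the stricter test accepts the group as having one unique (NULL) input.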

Answer 1 (score: 3)

Here is my implementation for postgres (edited to treat null as a unique value too):

create function solo_sfunc(inout anyarray, anyelement) 
       language plpgsql immutable as $$
begin
  if $1 is null then
    $1[1] := $2;
  else
    if ($1[1] is not null and $2 is null) 
         or ($1[1] is null and $2 is not null) 
         or ($1[1]!=$2) then 
      raise exception 'This aggregate only allows one unique input'; 
    end if;
  end if;
  return;
end;$$;

create function solo_ffunc(anyarray) returns anyelement 
       language plpgsql immutable as $$
begin
  return $1[1];
end;$$;

create aggregate solo(anyelement)
                     (sfunc=solo_sfunc, stype=anyarray, ffunc=solo_ffunc);

Sample tables for testing:

create table product(product_id integer primary key, description text);

insert into product(product_id, description)
values (1, 'anvil'), (2, 'brick'), (3, 'clay'), (4, 'door');

create table sale( sale_id serial primary key, 
                   product_id integer not null references product, 
                   cost money not null );

insert into sale(product_id, cost)
values (1, '100'::money), (1, '101'::money), (1, '102'::money),
       (2, '3'::money), (2, '3'::money), (2, '3'::money),
       (3, '24'::money), (3, '25'::money);

Answer 2 (score: 1)

You should define a UNIQUE constraint on (product_id, description); then you never have to worry about one product having two descriptions.

Answer 3 (score: 1)

Here is my implementation for Oracle. Unfortunately, I think you need one implementation per base type:

create type SoloNumberImpl as object
(
  val number, 
  flag char(1), 
  static function ODCIAggregateInitialize(sctx in out SoloNumberImpl) 
         return number,
  member function ODCIAggregateIterate( self in out SoloNumberImpl, 
                                        value in number )
         return number,
  member function ODCIAggregateTerminate( self in SoloNumberImpl, 
                                          returnValue out number, 
                                          flags in number ) 
         return number,
  member function ODCIAggregateMerge( self in out SoloNumberImpl, 
                                      ctx2 in SoloNumberImpl ) 
         return number
);
/

create or replace type body SoloNumberImpl is 
static function ODCIAggregateInitialize(sctx in out SoloNumberImpl)
       return number is 
begin
  sctx := SoloNumberImpl(null, 'N');
  return ODCIConst.Success;
end;

member function ODCIAggregateIterate( self in out SoloNumberImpl, 
                                      value in number ) 
       return number is
begin
  if self.flag='N' then
    self.val:=value;
    self.flag:='Y';
  else
    if (self.val is null and value is not null) 
         or (self.val is not null and value is null) 
         or (self.val!=value) then
      raise_application_error( -20001, 
                               'This aggregate only allows one unique input' );
    end if;
  end if;
  return ODCIConst.Success;
end;

member function ODCIAggregateTerminate( self in SoloNumberImpl, 
                                        returnValue out number, 
                                        flags in number )  
       return number is
begin
  returnValue := self.val;
  return ODCIConst.Success;
end;

member function ODCIAggregateMerge( self in out SoloNumberImpl, 
                                    ctx2 in SoloNumberImpl ) 
       return number is
begin
  if self.flag='N' then
    self.val:=ctx2.val;
    self.flag:=ctx2.flag;
  elsif ctx2.flag='Y' then
    if (self.val is null and ctx2.val is not null) 
          or (self.val is not null and ctx2.val is null) 
          or (self.val!=ctx2.val) then
      raise_application_error( -20001, 
                               'This aggregate only allows one unique input' );
    end if;
  end if;
  return ODCIConst.Success;
end;
end;
/

create function SoloNumber (input number) 
return number aggregate using SoloNumberImpl;
/
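
The iterate and merge steps carry the whole logic, so it may help to see them outside PL/SQL. The following is a rough Python model of the same state machine, not Oracle code: the names SoloState and solo_partitioned are invented for illustration, and the boolean seen stands in for flag = 'Y'. It assumes that merge is used to combine partial states built over separate slices of a group, as can happen under parallel execution.

```python
class SoloState:
    """Python model of the (val, flag) state kept by the aggregate."""

    def __init__(self):
        self.val = None
        self.seen = False  # False corresponds to flag = 'N'

    def iterate(self, value):
        # The first value is stored; every later value must match it.
        if not self.seen:
            self.val, self.seen = value, True
        elif self.val != value:
            raise ValueError("This aggregate only allows one unique input")

    def merge(self, other):
        # An empty state adopts the other one; two non-empty states must agree.
        if not self.seen:
            self.val, self.seen = other.val, other.seen
        elif other.seen and self.val != other.val:
            raise ValueError("This aggregate only allows one unique input")


def solo_partitioned(partitions):
    """Aggregate each partition separately, then merge the partial states."""
    total = SoloState()
    for part in partitions:
        state = SoloState()
        for value in part:
            state.iterate(value)
        total.merge(state)
    return total.val


print(solo_partitioned([[5, 5], [], [5]]))
```

Because None == None is true in Python, a partition holding only NULLs merges cleanly with another such partition but clashes with one holding a real value, which mirrors the explicit is null tests in the PL/SQL body.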