PostgreSQL table too large? Out of memory?

Time: 2016-10-27 20:52:08

Tags: postgresql error-handling

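I have a table with about 3.8 million rows. When I query the whole table, I get

    ERROR:  value overflows numeric format

which refers to the value returned by a user-defined function. But if I query the table in roughly two halves (see below), everything works.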

    SELECT day, item, price,
        CAST(my_func(price) OVER (PARTITION BY item ORDER BY day) AS numeric(8,2))
    FROM my_table
    --WHERE day < '3/1/2013'
    --WHERE day >= '3/1/2013'
    ;

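With either WHERE clause uncommented, the statement executes without error.

price is numeric(8,2), and no value in the price column is too large for numeric(8,2). In any case, changing the cast to numeric(20,2) makes no difference.

Here is the table definition: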

    CREATE TABLE my_table
    (
        item    character(5)    NOT NULL,
        day     date            NOT NULL,
        price   numeric(8,2),
        CONSTRAINT  my_table_pk PRIMARY KEY (item, day)
    );

...and the function:

    CREATE OR REPLACE FUNCTION my_func2 (avg numeric, IN price numeric)
    RETURNS numeric AS $$
    DECLARE
        alpha numeric;
    BEGIN
        alpha := 2.0/51;
        RETURN
            CASE
                WHEN avg IS NULL THEN price  -- avg is NULL for the first row, so price is returned
                ELSE round((alpha * price + (1-alpha) * avg),2)
            END;
    END;
    $$ LANGUAGE PLPgSQL;

...which is used in the aggregate:

    CREATE AGGREGATE my_func(numeric) (SFUNC = my_func2, STYPE = numeric);
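As a rough illustration of what this aggregate computes when it is used as a window function, here is a sketch in Python (an emulation, not PostgreSQL code). It assumes a single item partition, no NULL prices, and that the aggregate state starts out as NULL because no INITCOND is given. The name running_my_func is hypothetical and only used for this example.

```python
from decimal import Decimal, ROUND_HALF_UP

ALPHA = Decimal("2.0") / Decimal(51)  # same smoothing factor as in my_func2


def my_func2(avg, price):
    """State transition: previous average (or None) and current price -> new average."""
    if avg is None:  # first row of the partition: the state is still NULL
        return price
    blended = ALPHA * price + (1 - ALPHA) * avg
    # round(..., 2) as in the PL/pgSQL function
    return blended.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def running_my_func(prices):
    """Emulate my_func(price) OVER (ORDER BY day) for one item: one value per row."""
    state, out = None, []
    for price in prices:
        state = my_func2(state, price)
        out.append(state)
    return out


print(running_my_func([Decimal("10.00"), Decimal("20.00"), Decimal("15.00")]))
```

Each output row is a blend of the current price and the previous output, so the result is an exponentially weighted running average per item.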

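1 answer:

Answer 0 (score: 1)

The error occurs in your cast. The format numeric(8,2) is quite strict, and my_func() probably returns a value that does not fit it. To see this, look at the following queries: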

select 12.34::numeric(8,2);
 numeric 
---------
   12.34

select 12.345678::numeric(8,2);
 numeric 
---------
   12.35

select 12.3456789::numeric(8,2);
 numeric 
---------
   12.35

select 123456.123456789::numeric(8,2);
  numeric  
-----------
 123456.12

select 1234567.123456789::numeric(8,2);
ERROR:  numeric field overflow
DETAIL:  A field with precision 8, scale 2 must round to an absolute value less than 10^6.

select 1234567.8::numeric(8,2);
ERROR:  numeric field overflow
DETAIL:  A field with precision 8, scale 2 must round to an absolute value less than 10^6.

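As you can see, the returned number never has more than 8 digits in total and always has 2 decimal digits. The last two queries fail because they would have to return more than 8 digits. For example, you would expect 1234567.123456789 to be rounded to 1234567.12, but 1234567.12 consists of 9 digits, not 8. The same holds for 1234567.8, even though it has only 8 digits as written: the returned value needs 2 decimal digits, so Postgres should output 1234567.80, which again has 9 digits rather than 8.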
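The digit budget described above can be reproduced outside the database. The following is a minimal sketch in Python using the decimal module. It is an emulation of a cast to numeric(precision, scale), not PostgreSQL's implementation, and the function name cast_numeric is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def cast_numeric(value: str, precision: int, scale: int) -> Decimal:
    """Emulate a cast to numeric(precision, scale): round first, then check the size."""
    # Round to `scale` fractional digits; ties go away from zero.
    step = Decimal(1).scaleb(-scale)  # e.g. 0.01 for scale 2
    rounded = Decimal(value).quantize(step, rounding=ROUND_HALF_UP)
    # Only precision - scale digits are left for the integer part.
    limit = Decimal(10) ** (precision - scale)
    if abs(rounded) >= limit:
        raise OverflowError(
            f"numeric field overflow: precision {precision}, scale {scale} "
            f"must round to an absolute value less than 10^{precision - scale}"
        )
    return rounded


for text in ["12.345678", "123456.123456789", "1234567.8"]:
    try:
        print(text, "->", cast_numeric(text, 8, 2))
    except OverflowError as exc:
        print(text, "->", exc)
```

The check happens after rounding, which is why a value such as 999999.995 also fails: it rounds up to 1000000.00 and then needs 7 integer digits.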

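That said, you have several ways to solve this problem:

  1. Increase the total number of digits of the numeric type used to cast my_func(), for example numeric(16,2) for 16 digits (choose whatever number you need).
  2. Use another numeric format, such as real. For example: (my_func(price) OVER (PARTITION BY item ORDER BY day))::real
  3. If you need a specific number of decimal digits and an unlimited number of total digits, try round(my_func(price) OVER (PARTITION BY item ORDER BY day), 2). Otherwise, modify my_func() to return round(returned_value, 2).
  4. To help you understand and/or find what causes the error, consider this. For at least one row, evaluating my_func() yields a number with more than 6 digits to the left of the decimal point. To find the rows that generate the error, simply execute the following query:

        WITH not_casted AS (
            SELECT day, item, price,
                my_func(price) OVER (PARTITION BY item ORDER BY day) AS fprice
            FROM my_table
        )
        SELECT * FROM not_casted WHERE fprice > 999999.99

    The rows returned by this query are the ones that generate the cast error. Obviously, this works only if you do not typecast to numeric(8,2) inside my_func(); otherwise the error is raised on the value you typecast there. Without knowing the function code, it is not possible to make other assumptions.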

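Update

I came up with an example based on a simulation. The code does the following:

  - creates several AGGREGATEs with different typecasts and rounding methods;
  - executes each AGGREGATE on a simulated random sample that stands in for your data (hopefully). It generates 10 prices per day, one for each of 10 items, over 31 days. The sample does not need to be realistic to demonstrate the loss of precision, so please do not blame me if the data are not simulated properly :)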

Here is the code that creates the four functions and aggregates:

    -- typecast price and arithmetics to numeric(8,2)
    CREATE OR REPLACE FUNCTION my_func_numeric_8_2_a (avg numeric(8,2), IN price numeric(8,2))
    RETURNS numeric(8,2) AS $$
    DECLARE
        alpha numeric;
    BEGIN
        alpha := 2.0/51;
        RETURN
            CASE
                WHEN avg IS NULL THEN price
                ELSE (alpha * price + (1-alpha) * avg)::numeric(8,2)
            END;
    END;
    $$ LANGUAGE PLPgSQL;
    CREATE AGGREGATE my_func_numeric_8_2(numeric(8,2)) (SFUNC = my_func_numeric_8_2_a, STYPE = numeric(8,2));

    -- typecast price and arithmetics to numeric and round(arithmetics, 2)
    CREATE OR REPLACE FUNCTION my_func_numeric_round_a(avg numeric, IN price numeric)
    RETURNS numeric AS $$
    DECLARE
        alpha numeric;
    BEGIN
        alpha := 2.0/51;
        RETURN
            CASE
                WHEN avg IS NULL THEN price
                ELSE round((alpha * price + (1-alpha) * avg), 2)
            END;
    END;
    $$ LANGUAGE PLPgSQL;
    CREATE AGGREGATE my_func_numeric_round(numeric) (SFUNC = my_func_numeric_round_a, STYPE = numeric);

    -- no typecast (double precision type)
    CREATE OR REPLACE FUNCTION my_func_dp_a(avg double precision, IN price double precision)
    RETURNS double precision AS $$
    DECLARE
        alpha double precision;
    BEGIN
        alpha := 2.0/51;
        RETURN
            CASE
                WHEN avg IS NULL THEN price
                ELSE (alpha * price + (1-alpha) * avg)
            END;
    END;
    $$ LANGUAGE PLPgSQL;
    CREATE AGGREGATE my_func_dp(double precision) (SFUNC = my_func_dp_a, STYPE = double precision);

    -- typecast price and arithmetics to numeric
    CREATE OR REPLACE FUNCTION my_func_numeric_a(avg numeric, IN price numeric)
    RETURNS numeric AS $$
    DECLARE
        alpha numeric;
    BEGIN
        alpha := 2.0/51;
        RETURN
            CASE
                WHEN avg IS NULL THEN price
                ELSE (alpha * price + (1-alpha) * avg)
            END;
    END;
    $$ LANGUAGE PLPgSQL;
    CREATE AGGREGATE my_func_numeric(numeric) (SFUNC = my_func_numeric_a, STYPE = numeric);

Now, the code that simulates the data and applies the four aggregates:

    WITH sample AS (
        SELECT
            "day",
            (random())*10 AS price,
            generate_series(1,10)::text AS item
        FROM (
            SELECT generate_series('2000-01-01'::timestamp, '2000-01-31'::timestamp, '1 day'::interval)::date AS "day"
        ) AS calendar
    )
    SELECT
        "day", item, price,
        -- typecast price and arithmetics to numeric(8,2)
        my_func_numeric_8_2(price::numeric(8,2)) OVER (PARTITION BY item ORDER BY "day") AS numeric_8_2,
        -- typecast price and arithmetics to numeric and round(arithmetics, 2)
        my_func_numeric_round(price::numeric) OVER (PARTITION BY item ORDER BY "day") AS numeric_round,
        -- typecast price and arithmetics to numeric and round the final result
        round(my_func_numeric(price::numeric) OVER (PARTITION BY item ORDER BY "day"), 2) AS round_numeric,
        -- no typecast (double precision type)
        my_func_dp(price) OVER (PARTITION BY item ORDER BY "day") AS no_typecast,
        -- typecast price and arithmetics to numeric
        my_func_numeric(price::numeric) OVER (PARTITION BY item ORDER BY "day") AS numeric
    FROM sample
    ORDER BY item, "day"

Because random() is used, every execution of the query generates different results. Scrolling down the results, you will see many rows whose computed columns differ from each other, even though the same price is used to compute all of them. The columns are also ordered by decreasing loss of precision (or increasing precision): my_func_dp(price) is the most precise of the four aggregates, while my_func_numeric_8_2(price::numeric(8,2)) is the least precise but the most "accurate".

If you run the previous query from the command line, you will notice that my_func_numeric(price::numeric) returns numbers of increasing length, because an unconstrained numeric is as precise as possible, so its length is not fixed and can vary. If you execute the query from pgAdmin, you will get rounded numbers of a fixed length.

[Screenshot of a portion of the simulated results.]
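To see the same effect without a database, here is a small Python sketch under stated assumptions: it emulates the running average for a single item with the decimal module, uses a fixed seed instead of the query's random(), and compares rounding at every step with rounding only the final result. The helper name ema is hypothetical, and Python's default 28-digit decimal context only approximates an unconstrained numeric.

```python
import random
from decimal import Decimal, ROUND_HALF_UP

ALPHA = Decimal(2) / Decimal(51)  # same smoothing factor as in the functions
CENT = Decimal("0.01")


def ema(prices, round_each_step):
    """Running average for one item; optionally round to 2 decimals at every step."""
    avg = None
    for price in prices:
        if avg is None:
            avg = price  # first row: the state is still NULL
        else:
            avg = ALPHA * price + (1 - ALPHA) * avg
            if round_each_step:
                avg = avg.quantize(CENT, rounding=ROUND_HALF_UP)
    return avg


random.seed(0)  # fixed seed so that repeated runs give the same sample
# 31 daily prices between 0.00 and 9.99 for a single item
prices = [Decimal(random.randrange(0, 1000)) / 100 for _ in range(31)]

stepwise = ema(prices, round_each_step=True)          # like numeric_round
unrounded = ema(prices, round_each_step=False)        # like the numeric column
final_only = unrounded.quantize(CENT, rounding=ROUND_HALF_UP)  # like round_numeric

print("rounded at every step:", stepwise)
print("rounded once at the end:", final_only)
print("not rounded:", unrounded)
```

Rounding at every step feeds each rounding error into the next row, so the two rounded values can drift apart by a few cents, while the unrounded value keeps growing in length.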