Create a function that replaces text in a varchar using a dictionary table?

Asked: 2016-12-19 13:55:34

Tags: postgresql plpgsql

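I am trying to build a function that lets me do something like this:

replace_with_dict(<this_table_to_replace_column>, <dictionary_table>, <dictionary_from_field>, <dictionary_to_field>)

The dictionary table will have a "from" column and a "to" column, and the function will replace every occurrence of a "from" value in the input with the corresponding "to" value.

I have been trying, but without success. This is what I have so far.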

CREATE OR REPLACE FUNCTION replace_with_dict( to_replace VARCHAR, 
dict_table regclass, from_field VARCHAR, to_field VARCHAR)
RETURNS VARCHAR AS $$
    DECLARE
        replaced VARCHAR;
        dict_entry RECORD;
        from_replace_pattern VARCHAR;
        to_replace_pattern VARCHAR;
        dictionary CURSOR FOR SELECT from_field AS "in", to_field AS "out" FROM basf_dict;
    BEGIN 
        replaced := to_replace;
--      EXECUTE(format('SELECT %S, %S FROM %S;', from_field, to_field, dict_table)) IN dictionary;
        FOR dic_entry IN dictionary LOOP
            from_replace_pattern := ' ' || dic_entry."in"  || ' ';
            to_replace_pattern   := ' ' || dic_entry."out" || ' ';
            replaced := REPLACE(replaced, from_replace_pattern, to_replace_pattern);
        END LOOP;
        RETURN replaced;
    END;
$$ LANGUAGE plpgsql

When I try to run the function above in a query, like replace_with_dict(p.nom_produto, "basf_dict", "de", "para"), I get this error:

  SQL Error [42703]: ERROR: column "basf_dict" does not exist
  Posição: 86
  org.postgresql.util.PSQLException: ERROR: column "basf_dict" does not exist
  Posição: 86

Edit 1:

Note that a variable name was misspelled. I have fixed it, and my function is now declared as follows:

CREATE OR REPLACE FUNCTION replace_with_dict( to_replace VARCHAR, dict_table VARCHAR, from_field VARCHAR, to_field VARCHAR)
RETURNS VARCHAR AS $$
    DECLARE
        replaced VARCHAR;
        dict_entry RECORD;
        from_replace_pattern VARCHAR;
        to_replace_pattern VARCHAR;
--      dictionary CURSOR FOR SELECT from_field AS d_in, to_field AS d_out FROM basf_dict;
        query text;
    BEGIN 
        query := format('SELECT %I, %I FROM %I;', from_field, to_field, dict_table);
        replaced := to_replace;

        FOR dict_entry IN EXECUTE query LOOP
            from_replace_pattern := ' ' || dic_entry.d_in  || ' ';
            to_replace_pattern   := ' ' || dic_entry.d_out || ' ';
            replaced := REPLACE(replaced, from_replace_pattern, to_replace_pattern);
        END LOOP;
        RETURN replaced;
    END;
$$ LANGUAGE plpgsql

It still doesn't work. Now I get the following error when I run a query that uses this function:

SQL Error [42P01]: ERROR: missing FROM-clause entry for table "dic_entry"
  Onde: PL/pgSQL function replace_with_dict(character varying,character varying,character varying,character varying) line 14 at assignment
  org.postgresql.util.PSQLException: ERROR: missing FROM-clause entry for table "dic_entry"
  Onde: PL/pgSQL function replace_with_dict(character varying,character varying,character varying,character varying) line 14 at assignment

Edit 2:

To better explain my motivation for creating the function, here is what I do not want to do:

SELECT
        p.id,
        p.nom_produto,
        string_ranking_by_array( 
            REPLACE(
                REPLACE(
                    p.nom_produto,
                    'FOS',
                    'FO'),
                'S B', 
                'S_B'), 
            string_to_array( pe.nom_produto, ' ' ) 
        ) AS ranking,
        pe.nom_produto AS nom_pe,
        pe.ean_produto,
        pe.id AS id_pe
    FROM
        produto p, produto_empresa pe
    WHERE 1 = 1
        AND p.id_loja = 23
        AND( p.ean_produto IS NULL OR p.ean_produto = '' )
        AND CHAR_LENGTH( cod_produto )= 12
        AND cod_produto LIKE 'SC%'
    ORDER BY
        ranking DESC,
        p.nom_produto 

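I don't want to add yet another nested REPLACE for every new improvement I might find.

2 answers:

Answer 0 (score: 1)

Your function has two problems:

  • the loop variable is declared as dict_entry, but the loop body refers to it as dic_entry;
  • d_in and d_out are unknown unless you define them as aliases in the query.

In addition, you should select from the dictionary table only the rows that match individual words of the input string, to reduce the number of loop iterations. You can also use regexp_replace() instead of replace() to replace whole words only (the attempt with padding spaces will not work properly). The escapes \m and \M match the beginning and end of a word; see the documentation.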

CREATE OR REPLACE FUNCTION replace_with_dict
    (to_replace VARCHAR, dict_table VARCHAR, from_field VARCHAR, to_field VARCHAR)
RETURNS VARCHAR AS $$
    DECLARE
        dict_entry RECORD;
        query text;
        pattern text;
        words text[];
    BEGIN 
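        -- Split the input into words so only matching dictionary rows are fetched.
        words := string_to_array(to_replace, ' ');
        -- %I quotes identifiers; %L quotes the word array as a literal.
        query := format(
            'SELECT %I AS d_in, %I AS d_out FROM %I WHERE %I = ANY(%L);',
            from_field, to_field, dict_table, from_field, words
            );
        FOR dict_entry IN EXECUTE query LOOP
            -- \m and \M restrict the match to whole words.
            pattern := format('\m%s\M', dict_entry.d_in);
            to_replace := regexp_replace(to_replace, pattern, dict_entry.d_out, 'g');
        END LOOP;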
        RETURN to_replace;
    END;
$$ LANGUAGE plpgsql;

See this working example.
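
To see why whole-word matching matters here, the three approaches can be compared on one input. This is only a small sketch: the sample string is made up, and it assumes standard_conforming_strings is on (the default in current PostgreSQL), so the backslashes reach the regex engine unchanged.

```sql
-- Plain replace() also rewrites the inside of longer words.
SELECT replace('FOS FOSFATO', 'FOS', 'FO');                  -- FO FOFATO

-- Padding the pattern with spaces misses a word at the start or end of the string.
SELECT replace('FOS FOSFATO', ' FOS ', ' FO ');              -- FOS FOSFATO (unchanged)

-- \m and \M anchor the match to word boundaries.
SELECT regexp_replace('FOS FOSFATO', '\mFOS\M', 'FO', 'g');  -- FO FOSFATO
```

Note that dict_entry.d_in is used as a regular expression in the function above, so a dictionary entry containing characters such as . or ( would be interpreted as regex syntax rather than literal text.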

If you don't care about whole words and want to replace any substring (possibly containing spaces), use plain replace() without the extra spaces:

CREATE OR REPLACE FUNCTION replace_with_dict_simple
    (to_replace VARCHAR, dict_table VARCHAR, from_field VARCHAR, to_field VARCHAR)
RETURNS VARCHAR AS $$
    DECLARE
        dict_entry RECORD;
        query text;
    BEGIN 
        query := format(
            'SELECT %I AS d_in, %I AS d_out FROM %I;',
            from_field, to_field, dict_table
            );
        FOR dict_entry IN EXECUTE query LOOP
            to_replace := replace(to_replace, dict_entry.d_in, dict_entry.d_out);
        END LOOP;
        RETURN to_replace;
    END;
$$ LANGUAGE plpgsql;
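
A call might look like the sketch below. The table name basf_dict and the columns de and para are taken from the question; the column types, the sample rows, the input string, and the alias nom_normalizado are assumptions made for illustration. The table and column names are passed as string literals in single quotes. In PostgreSQL, double quotes denote identifiers, which is why "basf_dict" in the original call was reported as a missing column.

```sql
-- Hypothetical dictionary table (column types and rows are assumed).
CREATE TABLE basf_dict (de text, para text);
INSERT INTO basf_dict (de, para) VALUES ('FOS', 'FO'), ('S B', 'S_B');

-- Single-quoted arguments: these are string values, not identifiers.
SELECT replace_with_dict_simple('PROD FOS X', 'basf_dict', 'de', 'para');  -- PROD FO X

-- Used inside a query, in place of the nested REPLACE calls:
SELECT p.id,
       replace_with_dict_simple(p.nom_produto, 'basf_dict', 'de', 'para') AS nom_normalizado
FROM produto p
WHERE p.id_loja = 23;
```

Because the dictionary query has no ORDER BY, the order in which entries are applied is not guaranteed. If entries can overlap (for example 'FOS' and 'S B' on an input like 'FOS B'), the result may depend on that order.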

Answer 1 (score: 0)

Like this:

t=# create table so6(d_from text,d_to text);
CREATE TABLE
t=# insert into so6 select 'street', 'calle';
INSERT 0 1
t=# CREATE OR REPLACE FUNCTION replace_with_dict( to_replace VARCHAR, dict_table regclass, from_field VARCHAR, to_field VARCHAR)
t-# RETURNS VARCHAR AS $$
t$#     DECLARE
t$#         _r text;
t$#         _l record;
t$#     BEGIN
t$#       _r = to_replace;
t$#       for _l in (select unnest(string_to_array(to_replace, ' ')) w) loop
t$#         execute (format('SELECT coalesce(replace($s$%s$s$,%I,%I),$s$%s$s$) FROM %I WHERE %I = $v$%s$v$;', to_replace,from_field, to_field, _r,dict_table, from_field,_l.w)) into _r;
t$#         if _r is not null then
t$#           to_replace = _r;
t$#         end if;
t$#       end loop;
t$#       return to_replace;
t$#     END;
t$# $$ LANGUAGE plpgsql
t-# ;
CREATE FUNCTION
Time: 7.404 ms
t=# select replace_with_dict('go to street "LA Palma"','so6'::regclass,'d_from','d_to')
t-# ;
   replace_with_dict
------------------------
 go to calle "LA Palma"
(1 row)

Time: 0.938 ms

But I think you should consider whether you really want to do something like this.