Is there any way to do error handling in Snowflake?

Time: 2020-03-16 16:05:42

Tags: stored-procedures snowflake-cloud-data-platform

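I am currently loading data from one Snowflake table into another Snowflake table, and I perform some data type conversions during the load.

However, if any error occurs, the whole load fails. I need to capture the failing rows in a table and continue loading when an error occurs.

I have tried the stored procedure below, but it only captures the error message. Please let me know whether there is a way to achieve this in Snowflake.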

CREATE OR REPLACE PROCEDURE LOAD_TABLE_A() 
RETURNS varchar 
NOT NULL 
LANGUAGE javascript 
AS 
$$
var result;
// Build the INSERT ... SELECT statement that converts and loads the rows.
var sql_command = "insert into TABLE A";
sql_command += " select";
sql_command += " migration_status,to_date(status_date,'ddmmyyyy') as status_date,";
sql_command += " to_time(status_time,'HH24MISS') as status_time,unique_unit_of_migration_number,reason,";
sql_command += " to_timestamp_ntz(current_timestamp) as insert_date_time";
sql_command += " from TABLE B";
sql_command += " where insert_date_time>(select max(insert_date_time) from TABLE A);";
try {
    snowflake.execute({ sqlText: sql_command});
    result = "Succeeded";
} 
catch (err) {
    result = "Failed";
    snowflake.execute({
      sqlText: `insert into mcs_error_log VALUES (?,?,?,?)`
      ,binds: [err.code, err.state, err.message, err.stackTraceTxt]
      });
}
return result;
$$;

4 Answers:

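Answer 0 (score: 1)

I worked up an example of how to send the good rows from one table into another table while sending the bad rows to a separate table. It should be on the Snowflake blog soon. The key is to use a multi-table insert, as follows: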

-- Create a staging table with all columns defined as strings.
-- This will hold all raw values from the load files.
create or replace table SALES_RAW
(                                       -- Actual Data Type
  SALE_TIMESTAMP            string,     -- timestamp
  ITEM_SKU                  string,     -- int
  PRICE                     string,     -- number(10,2)
  IS_TAXABLE                string,     -- boolean
  COMMENTS                  string      -- string
);

-- Create the production table with actual data types.
create or replace table SALES_STAGE
(
  SALE_TIMESTAMP            timestamp,
  ITEM_SKU                  int,
  PRICE                     number(10,2),
  IS_TAXABLE                boolean,
  COMMENTS                  string
);

-- Simulate adding some rows from a load file. Two rows are good.
-- Four rows generate errors when converting to the data types.
insert into SALES_RAW 
    (SALE_TIMESTAMP, ITEM_SKU, PRICE, IS_TAXABLE, COMMENTS) 
    values
    ('2020-03-17 18:21:34', '23289', '3.42',   'TRUE',  'Good row.'),
    ('2020-17-03 18:21:56', '91832', '1.41',   'FALSE', 'Bad row: SALE_TIMESTAMP has the month and day transposed.'),
    ('2020-03-17 18:22:03', '7O242', '2.99',   'T',     'Bad row: ITEM_SKU has a capital "O" instead of a zero.'),
    ('2020-03-17 18:22:10', '53921', '$6.25',  'F',     'Bad row: PRICE should not have a dollar sign.'),
    ('2020-03-17 18:22:17', '90210', '2.49',   'Foo',   'Bad row: IS_TAXABLE cannot be converted to true or false'),
    ('2020-03-17 18:22:24', '80386', '1.89',   '1',     'Good row.');

-- Make sure the rows inserted okay.
select * from SALES_RAW;

-- Create a table to hold the bad rows.
create or replace table SALES_BAD_ROWS like SALES_RAW;

-- Insert good rows into SALES_STAGE and
-- bad rows into SALES_BAD_ROWS
insert first
  when  (SALE_TIMESTAMP_X is null and SALE_TIMESTAMP is not null) or
        (ITEM_SKU_X       is null and ITEM_SKU       is not null) or
        (PRICE_X          is null and PRICE          is not null) or
        (IS_TAXABLE_X     is null and IS_TAXABLE     is not null)
  then
        into SALES_BAD_ROWS
            (SALE_TIMESTAMP, ITEM_SKU, PRICE, IS_TAXABLE, COMMENTS)
        values
            (SALE_TIMESTAMP, ITEM_SKU, PRICE, IS_TAXABLE, COMMENTS)  
  else
        into SALES_STAGE 
            (SALE_TIMESTAMP, ITEM_SKU, PRICE, IS_TAXABLE, COMMENTS) 
         values
            (SALE_TIMESTAMP_X, ITEM_SKU_X, PRICE_X, IS_TAXABLE_X, COMMENTS)
select  try_to_timestamp (SALE_TIMESTAMP)   as SALE_TIMESTAMP_X,
        try_to_number    (ITEM_SKU, 10, 0)  as ITEM_SKU_X,
        try_to_number    (PRICE, 10, 2)     as PRICE_X,
        try_to_boolean   (IS_TAXABLE)       as IS_TAXABLE_X,
                                               COMMENTS, 
                                               SALE_TIMESTAMP,
                                               ITEM_SKU,
                                               PRICE,
                                               IS_TAXABLE
from    SALES_RAW;

-- Examine the two good rows
select * from SALES_STAGE;

-- Examine the four bad rows
select * from SALES_BAD_ROWS;

Answer 1 (score: 0)

Snowflake captures load error information, which you can access by querying the COPY_HISTORY table function.

https://docs.snowflake.net/manuals/sql-reference/functions/copy_history.html

In the COPY INTO command, you can use the ON_ERROR option to decide whether the load should continue processing a file when one or more rows fail.

https://docs.snowflake.net/manuals/sql-reference/sql/copy-into-table.html#copy-options-copyoptions

Answer 2 (score: 0)

I suggest you check out try_cast.

https://docs.snowflake.net/manuals/sql-reference/functions/try_cast.html

For your query, I would also just use a view, or a materialized view if performance is an issue.
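
As a rough illustration of how TRY_ conversions could be applied to the query from the question, the sketch below builds two statements that a stored procedure might pass to snowflake.execute(): one that copies the unconvertible rows aside and one that loads the rest. Because the question supplies format strings, the sketch swaps in TRY_TO_DATE and TRY_TO_TIME rather than TRY_CAST. It assumes status_date and status_time are string columns, writes the question's table names as TABLE_A and TABLE_B, and invents TABLE_B_BAD_ROWS as a hypothetical table with the same columns as TABLE_B. The incremental filter on insert_date_time is left out for brevity.

```javascript
// Sketch only: returns SQL text, it does not run anything.
// TABLE_A and TABLE_B stand in for the tables in the question;
// TABLE_B_BAD_ROWS is a hypothetical table shaped like TABLE_B.
function buildLoadSql() {
  // A row is "bad" when a raw value is present but its TRY_ conversion is NULL.
  var badRow =
    "(status_date is not null and try_to_date(status_date, 'ddmmyyyy') is null) or " +
    "(status_time is not null and try_to_time(status_time, 'HH24MISS') is null)";

  return {
    // Copy the rows that cannot be converted into the error table.
    badRows: "insert into TABLE_B_BAD_ROWS select * from TABLE_B where " + badRow,
    // Load only the rows whose conversions succeed.
    goodRows:
      "insert into TABLE_A select migration_status, " +
      "try_to_date(status_date, 'ddmmyyyy'), " +
      "try_to_time(status_time, 'HH24MISS'), " +
      "unique_unit_of_migration_number, reason, " +
      "to_timestamp_ntz(current_timestamp) " +
      "from TABLE_B where not (" + badRow + ")"
  };
}
```

Inside a stored procedure, the two strings would be executed one after the other, for example snowflake.execute({ sqlText: buildLoadSql().badRows }) followed by the goodRows statement.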

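Answer 3 (score: 0)

I think a nice solution is to wrap your SQL calls with a helper method.

For example, let's say that instead of calling snowflake.execute({})...

you use something like: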

EXEC("select * from table1 where x > ?", [param1]);

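Inside the EXEC method you can do the try/catch, and you can easily add something like a continue handler, or an exit handler where you put the logic that logs the error to a table.

I put together a repo with tools and some code snippets. Maybe take a look: https://github.com/orellabac/SnowJS-Helpers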
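
To make the wrapper idea concrete, here is a minimal sketch of what such a helper could look like. It is not the code from the repository above: makeExec, the "exit" mode, and the returned { ok, ... } shape are all hypothetical names chosen for illustration. The only assumption about Snowflake is that the object passed in exposes execute({ sqlText, binds }), as the snowflake object does in the question's stored procedure. The error columns mirror the mcs_error_log insert from the question.

```javascript
// Hypothetical helper factory (an illustration, not a library API).
// `sf` must expose execute({sqlText, binds}), like the `snowflake` object
// inside a JavaScript stored procedure. `errorTable` is the log table name.
function makeExec(sf, errorTable) {
  // Binding undefined can fail, so map missing error fields to null.
  function orNull(value) {
    return value === undefined ? null : value;
  }

  return function EXEC(sqlText, binds, onError) {
    try {
      var resultSet = sf.execute({ sqlText: sqlText, binds: binds || [] });
      return { ok: true, resultSet: resultSet };
    } catch (err) {
      try {
        // Record the failure using the same four columns as the question.
        sf.execute({
          sqlText: "insert into " + errorTable + " values (?,?,?,?)",
          binds: [orNull(err.code), orNull(err.state),
                  orNull(err.message), orNull(err.stackTraceTxt)]
        });
      } catch (logErr) {
        // A logging failure should not hide the original error.
      }
      if (onError === "exit") {
        throw err;                        // "exit handler": stop the procedure
      }
      return { ok: false, error: err };   // "continue handler": keep going
    }
  };
}
```

In a stored procedure this might be used as var EXEC = makeExec(snowflake, "mcs_error_log"); followed by EXEC("select * from table1 where x > ?", [param1]);. Returning a result object instead of throwing lets the caller decide, statement by statement, whether a failure should stop the load.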