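Redshift COPY command return value

Time: 2016-08-05 06:43:45

Tags: amazon-redshift

Can we get the number of rows inserted by the COPY command? Some records may fail, so what was actually inserted successfully?

I have a file of JSON objects in Amazon S3 and am trying to load the data into Redshift with the COPY command. How can I tell how many records were inserted successfully and how many failed?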

1 answer:

Answer 0: (score: 0)

Load some sample data:

db=# copy test from 's3://bucket/data' credentials '' maxerror 5;
INFO:  Load into table 'test' completed, 4 record(s) loaded successfully.
COPY

db=# copy test from 's3://bucket/err_data' credentials '' maxerror 5;
INFO:  Load into table 'test' completed, 1 record(s) loaded successfully.
INFO:  Load into table 'test' completed, 2 record(s) could not be loaded.  Check 'stl_load_errors' system table for details.
COPY
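
As a quick check right after a load, Redshift also provides two session-level functions: `pg_last_copy_count()` returns the number of rows loaded by the most recent COPY in the current session, and `pg_last_copy_id()` returns that COPY's query ID. A minimal sketch, assuming it runs in the same session as the COPY above (the column aliases are only illustrative):

```sql
-- Rows loaded by the most recent COPY in this session.
select pg_last_copy_count() as rows_loaded;

-- Rows rejected by that same COPY, counted from the load-error log.
select count(*) as rows_not_loaded
from stl_load_errors
where query = pg_last_copy_id();
```

These functions only see the current session, so they do not cover loads run earlier or from another connection.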

To get the counts for every committed load, query the system tables:

with _successful_loads as (
    -- One row per COPY query whose transaction committed, with the files it loaded.
    select
        stl_load_commits.query
      , listagg(trim(filename), ', ') within group(order by trim(filename)) as filenames
    from stl_load_commits
    left join stl_query using(query)
    left join stl_utilitytext using(xid)
    where rtrim("text") = 'COMMIT'
    group by query
),
_unsuccessful_loads as (
    -- Number of rejected rows per COPY query.
    select
        query
      , count(1) as errors
    from stl_load_errors
    group by query
)
-- Rows inserted per load, next to the rejected-row count (null when nothing was rejected).
select
    query
  , filenames
  , sum(stl_insert.rows)            as rows_loaded
  , max(_unsuccessful_loads.errors) as rows_not_loaded
from stl_insert
inner join _successful_loads using(query)
left join _unsuccessful_loads using(query)
group by query, filenames
order by query, filenames
;

This returns:

 query |                   filenames                    | rows_loaded | rows_not_loaded
-------+------------------------------------------------+-------------+-----------------
 45597 | s3://bucket/err_data.json                      |           1 |               2
 45611 | s3://bucket/data1.json, s3://bucket/data2.json |           4 |
(2 rows)
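
To see why particular rows were rejected, the query ID from this result can be looked up in `stl_load_errors`. A sketch, using query ID 45597 from the sample output above (substitute your own ID):

```sql
-- One row per rejected input line for the given COPY query.
select
    trim(filename)   as filename
  , line_number
  , trim(colname)    as colname
  , err_code
  , trim(err_reason) as err_reason
from stl_load_errors
where query = 45597
order by filename, line_number
;
```

The `raw_line` column of the same table holds the offending input, which can help when the error reason alone is not enough.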