我们可以通过复制命令获取插入的行数吗?有些记录可能会失败,那么成功插入的记录是什么?
我在Amazon S3中有一个带有json对象的文件,并尝试使用copy命令将数据加载到Redshift中。我如何知道成功插入了多少条记录以及有多少条记录失败?
答案 0 :(得分:0)
加载一些示例数据:
db=# copy test from 's3://bucket/data' credentials '' maxerror 5;
INFO: Load into table 'test' completed, 4 record(s) loaded successfully.
COPY
db=# copy test from 's3://bucket/err_data' credentials '' maxerror 5;
INFO: Load into table 'test' completed, 1 record(s) loaded successfully.
INFO: Load into table 'test' completed, 2 record(s) could not be loaded. Check 'stl_load_errors' system table for details.
COPY
然后是以下查询:
with _successful_loads as (
select
stl_load_commits.query
, listagg(trim(filename), ', ') within group(order by trim(filename)) as filenames
from stl_load_commits
left join stl_query using(query)
left join stl_utilitytext using(xid)
where rtrim("text") = 'COMMIT'
group by query
),
_unsuccessful_loads as (
select
query
, count(1) as errors
from stl_load_errors
group by query
)
select
query
, filenames
, sum(stl_insert.rows) as rows_loaded
, max(_unsuccessful_loads.errors) as rows_not_loaded
from stl_insert
inner join _successful_loads using(query)
left join _unsuccessful_loads using(query)
group by query, filenames
order by query, filenames
;
,并提供:
query | filenames | rows_loaded | rows_not_loaded
-------+------------------------------------------------+-------------+-----------------
45597 | s3://bucket/err_data.json | 1 | 2
45611 | s3://bucket/data1.json, s3://bucket/data2.json | 4 |
(2 rows)