Question

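When loading data from S3 into Redshift, is there any method, option, or workaround to skip an entire file that contains bad entries? Note that I am not talking about skipping the invalid entries within a file, but about skipping the whole file that contains bad entries or records.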

Answer 1

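By default, if you do not supply the MAXERROR option in the COPY command, Redshift fails the entire file. This is its default behavior, since MAXERROR defaults to 0.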

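copy catdemo from 's3://awssampledbuswest2/tickit/category_pipe.txt' iam_role 'arn:aws:iam::<aws-account-id>:role/<role-name>' region 'us-west-2';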

The command above fails the entire file, and no data from the given file is loaded. See the COPY documentation for more information.

If you specify the MAXERROR option, COPY instead skips the bad records in the file, up to the number you give, and loads the rest.

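copy catdemo from 's3://awssampledbuswest2/tickit/category_pipe.txt' iam_role 'arn:aws:iam::<aws-account-id>:role/<role-name>' region 'us-west-2' MAXERROR 500;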

In the example above, MAXERROR 500 means Redshift skips bad records and keeps loading as long as the error count stays below 500.
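To see which records were skipped and which files they came from, the load errors can be inspected after the COPY finishes. The query below is only a sketch: it assumes it runs in the same session as the COPY (so that `pg_last_copy_id()` refers to that load) and that the current user can read the `stl_load_errors` system table.

```sql
-- Count rejected rows per source file for the most recent COPY in this session.
-- Assumption: run in the same session, right after the COPY command.
select trim(filename) as source_file,
       count(*)       as rejected_rows,
       min(trim(err_reason)) as sample_reason
from stl_load_errors
where query = pg_last_copy_id()
group by 1
order by rejected_rows desc;
```

Files that show up in this result contained at least one bad record. MAXERROR itself still loads the good rows from those files, so it does not skip them as a whole.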

I hope this answers your question. If not, please update the question and I will refocus the answer.

Redshift: skip an entire file that contains errors
