Loading date data from S3 (ORC) into Redshift

Time: 2020-01-18 01:55:12

Tags: amazon-s3 amazon-redshift orc

S3 data (ORC format):

orc-tools data data.orc

{"date_id":"2020-01-15","counter":3}
{"date_id":"2020-01-15","counter":56}

ORC schema:

Type: struct<date_id:string,counter:bigint>

Redshift schema:

CREATE TABLE IF NOT EXISTS  mytable (
  date_id                                       DATE ENCODE zstd,
  counter                                       BIGINT ENCODE zstd
)
COMPOUND SORTKEY(date_id);

When I try to copy the data into Redshift, I get the following error:

Task failed due to an internal error. In file s3://<path> declared column type TIMESTAMP
for column date_id incompatible w

COPY command:

copy db.mytable from 's3://<path>' credentials '<iam_role>' format as orc;

I tried changing date_id to TIMESTAMP in the Redshift schema and got the same error. Looking further at the AWS documentation, COPY from ORC files does not support DATEFORMAT or TIMEFORMAT: https://docs.aws.amazon.com/redshift/latest/dg/copy-usage_notes-copy-from-columnar.html

How can I copy date data from S3 in ORC format into Redshift?
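
For illustration only, and not an answer from this thread: one possible workaround, assuming the error comes from the ORC file storing date_id as a string while the table declares it as DATE, is to COPY into a staging table whose column types mirror the ORC schema and then cast while inserting into the real table. The sketch below only builds the SQL statements. The helper name staging_load_statements and the staging table name mytable_staging are hypothetical, and the S3 path and credentials remain the placeholders from the question.

```python
def staging_load_statements(s3_path, credentials,
                            target="db.mytable",
                            staging="mytable_staging"):
    """Build SQL for a two-step load: stage the ORC strings, then cast to DATE.

    Assumption: the ORC file holds date_id as 'YYYY-MM-DD' strings, as in the
    sample rows of the question. All arguments are trusted placeholders here;
    they are interpolated into the SQL text without escaping.
    """
    return [
        # Staging column types mirror the ORC schema (string, bigint).
        f"CREATE TEMP TABLE {staging} (date_id VARCHAR(10), counter BIGINT);",
        # Same COPY form as in the question, but targeting the staging table.
        f"COPY {staging} FROM '{s3_path}' CREDENTIALS '{credentials}' "
        f"FORMAT AS ORC;",
        # Convert the string to DATE on the way into the real table.
        f"INSERT INTO {target} (date_id, counter) "
        f"SELECT TO_DATE(date_id, 'YYYY-MM-DD'), counter FROM {staging};",
        f"DROP TABLE {staging};",
    ]


if __name__ == "__main__":
    # Print the statements with the question's placeholders left as-is.
    for stmt in staging_load_statements("s3://<path>", "<iam_role>"):
        print(stmt)
```

The printed statements would then be run in order through whatever SQL client is already used for the COPY. Whether this resolves the error depends on the assumption above about how date_id is stored in the file.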

0 answers:

No answers yet.