Question

I have a MySQL table:

create table orderitems
(
    id char(36) collate utf8_bin not null
        primary key,
    store_id char(36) collate utf8_bin not null,
    ref_type int not null,
    ref_id char(36) collate utf8_bin not null,
    store_product_id char(36) collate utf8_bin not null,
    product_id char(36) collate utf8_bin not null,
    product_name varchar(50) null,
    main_image varchar(200) null,
    price int not null,
    count int not null,
    is_gift tinyint(1) not null
);

My colleague used Sqoop to dump this table into multiple Parquet files under an S3 path. I need to load these files into Redshift.

First

I tried the MySQL table DDL above and found that collate, utf8_bin, null, not null, and tinyint are not supported by Redshift. So I created the orderitems table in Redshift as follows:

create table orderitems
(
    id char(36),
    store_id char(36),
    ref_type int,
    ref_id char(36),
    store_product_id char(36),
    product_id char(36),
    product_name varchar(50),
    main_image varchar(200),
    price int,
    count int,
    is_gift SMALLINT
);

Then

I imported the data with:

COPY orderitems from 's3://xxxx/arch/M/orderitems/' CREDENTIALS 'aws_access_key_id=xxx;aws_secret_access_key=xxx'

But it failed with:

[XX000][500310] [Amazon](500310) Invalid operation: Load into table 'orderitems' failed.  Check 'stl_load_errors' system table for details.;

I checked the STL_LOAD_ERRORS table and found:

1216    Missing newline: Unexpected character 0x15 found at location 4

I searched around but found nothing. Can anyone tell me how to fix this?

Answer 1

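You left the format part of the COPY command at its default.

Per the Redshift documentation:

By default, the COPY command expects the source data to be in character-delimited UTF-8 text files. The default delimiter is a pipe character ( | ).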

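Character-delimited files use a newline to mark the end of each record, which explains the error message you received: Redshift tried to parse the binary Parquet data as delimited text. A Parquet file starts with the four ASCII bytes PAR1, so it is plausible that the parser accepted those and stopped at the first non-text byte (0x15) at location 4.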
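
For reference, the full error record can be pulled from the system table with a query along these lines. This is only a sketch: the columns are those of STL_LOAD_ERRORS, and the limit of 10 rows is an arbitrary choice.

```sql
-- Show the most recent load errors, newest first.
SELECT starttime,
       filename,      -- S3 object that failed to load
       line_number,
       colname,
       err_code,      -- 1216 in the question
       err_reason
FROM stl_load_errors
ORDER BY starttime DESC
LIMIT 10;
```

If filename points at one of the Parquet files and err_reason mentions a missing newline, the file was most likely read as text rather than as Parquet.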

Because Parquet is a distinct format with its own rules, you have to tell Redshift more about the type of file you are trying to load.

For example:

COPY orderitems from 's3://xxxx/arch/M/orderitems/'
CREDENTIALS 'aws_access_key_id=xxx;aws_secret_access_key=xxx' 
FORMAT AS PARQUET;
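
If you would rather authorize with an IAM role than with access keys, the same load might look like the sketch below. The role ARN is a placeholder, not a value from the question, and the role is assumed to have read access to the bucket. The row count at the end is just a quick sanity check.

```sql
-- Same load, authorized with an IAM role (placeholder ARN; substitute your own).
COPY orderitems
FROM 's3://xxxx/arch/M/orderitems/'
IAM_ROLE 'arn:aws:iam::<account-id>:role/<role-name>'
FORMAT AS PARQUET;

-- Quick sanity check that rows arrived.
SELECT COUNT(*) FROM orderitems;
```

As far as I know, COPY from Parquet maps file columns to table columns by position, so the Redshift table needs the same number of columns, in the same order as in the Sqoop output, with compatible types.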
