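How do I use the control-A character as csvDelimiter in an AWS DMS target S3 endpoint?

Asked: 2019-09-11 02:16:02

Tags: csv amazon-s3 delimiter aws-dms

I am using AWS DMS to extract data from Aurora to S3, and I want to use a csvDelimiter of my own choosing, ^A (control-A, octal \001), when the data is loaded into S3. How can I do that? By default, when S3 is the DMS target, "," is used as the delimiter: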

compressionType=NONE;csvDelimiter=,;csvRowDelimiter=\n;

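But I want to use the following instead:

compressionType=NONE;csvDelimiter='\001';csvRowDelimiter=\n;

With that setting, however, the delimiter is written to the output as literal text: I'\001'12345'\001'Abc'

I am configuring the target endpoint in the AWS DMS console. I also tried the following delimiters, and none of them worked: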

\\001 \u0001 '\u0001' \u01 \001

Actual result: I'\001'12345'\001'Abc'
Expected result: I^A12345^AAbc
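
The actual result suggests that the text typed into the console is stored character for character rather than being interpreted as an escape sequence. As a small illustration (not part of the original question, and the variable and function names are made up for this sketch), the following compares the six typed characters with the single control-A byte:

```shell
# The six characters typed into the console: ' \ 0 0 1 '
typed="'\\001'"

# The single byte that is actually wanted (control-A, octal 001).
ctrl_a=$(printf '\001')

# Helper: print the bytes of a string as hex, without spaces.
to_hex() { printf '%s' "$1" | od -An -tx1 | tr -d '[:space:]'; }

printf 'typed:  %s chars, bytes %s\n' "${#typed}" "$(to_hex "$typed")"
printf 'ctrl_a: %s chars, bytes %s\n' "${#ctrl_a}" "$(to_hex "$ctrl_a")"
# typed:  6 chars, bytes 275c30303127
# ctrl_a: 1 chars, bytes 01
```

The endpoint setting has to contain the one-byte value, not any textual spelling of it.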

1 answer:

Answer 0: (score: 0)

Here is what I did to solve this:

I used the AWS command line to set the delimiter on my target S3 endpoint. https://docs.aws.amazon.com/translate/latest/dg/setup-awscli.html

AWS CLI command:

aws dms modify-endpoint \
    --endpoint-arn arn:aws:dms:us-west-2:000001111222:endpoint:OXXXXXXXXXXXXXXXXXXXX4 \
    --endpoint-identifier dms-ep-tgt-s3-abc \
    --endpoint-type target \
    --engine-name s3 \
    --extra-connection-attributes "bucketFolder=data/folderx;bucketname=bkt-xyz;CsvRowDelimiter=^D;CompressionType=NONE;CsvDelimiter=^A;" \
    --service-access-role-arn arn:aws:iam::000001111222:role/XYZ-Datalake-DMS-Role \
    --s3-settings ServiceAccessRoleArn=arn:aws:iam::000001111222:role/XYZ-Datalake-DMS-Role,BucketName=bkt-xyz,CompressionType=NONE

Output:

{
    "Endpoint": {
        "Status": "active",
        "S3Settings": {
            "CompressionType": "NONE",
            "EnableStatistics": true,
            "BucketFolder": "data/folderx",
            "CsvRowDelimiter": "\u0004",
            "CsvDelimiter": "\u0001",
            "ServiceAccessRoleArn": "arn:aws:iam::000001111222:role/XYZ-Datalake-DMS-Role",
            "BucketName": "bkt-xyz"
        },
        "EndpointType": "TARGET",
        "ServiceAccessRoleArn": "arn:aws:iam::000001111222:role/XYZ-Datalake-DMS-Role",
        "SslMode": "none",
        "EndpointArn": "arn:aws:dms:us-west-2:000001111222:endpoint:OXXXXXXXXXXXXXXXXXXXX4",
        "ExtraConnectionAttributes": "bucketFolder=data/folderx;bucketname=bkt-xyz;CompressionType=NONE;CsvDelimiter=\u0001;CsvRowDelimiter=\u0004;",
        "EngineDisplayName": "Amazon S3",
        "EngineName": "s3",
        "EndpointIdentifier": "dms-ep-tgt-s3-abc"
    }
}
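
The ^A and ^D in the command are presumably the literal control characters (many terminals let you type them with Ctrl-V followed by Ctrl-A or Ctrl-D), which matches the \u0001 and \u0004 in the output. If typing them is awkward, a script can generate them with printf. The following is a minimal sketch under that assumption; APPLY and ENDPOINT_ARN are hypothetical environment variables introduced here so that nothing is sent to AWS unless explicitly requested.

```shell
# Generate the control characters instead of typing them in the terminal.
ctrl_a=$(printf '\001')   # ^A, used as the field delimiter
ctrl_d=$(printf '\004')   # ^D, used as the row delimiter

# Same attribute string as in the command, with the real bytes embedded.
attrs="bucketFolder=data/folderx;bucketname=bkt-xyz;CsvRowDelimiter=${ctrl_d};CompressionType=NONE;CsvDelimiter=${ctrl_a};"

# Make the control characters visible for a quick check (cat -v shows ^D and ^A).
printf '%s\n' "$attrs" | cat -v

# Only modify the endpoint when APPLY=1 is set (hypothetical safety switch).
if [ "${APPLY:-0}" = "1" ]; then
    aws dms modify-endpoint \
        --endpoint-arn "$ENDPOINT_ARN" \
        --extra-connection-attributes "$attrs"
fi
```

Run with APPLY=1 and ENDPOINT_ARN set to the target endpoint's ARN to apply the change; without them the script only prints the attribute string.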

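Note: after you run the AWS CLI command, the DMS console will not show the delimiter in the endpoint (it is a special character, so it is not visible). Once the task runs, however, the delimiter does appear in the data of the S3 files.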
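
Because the character is invisible, it can help to inspect a downloaded output file with tools that make control characters visible. The sketch below uses a simulated record built from the expected result in the question rather than a real DMS file, which would first have to be copied from the bucket (for example with aws s3 cp); the variable names are illustrative.

```shell
# Simulated record with control-A between the fields (stand-in for a real S3 file).
ctrl_a=$(printf '\001')
sample=$(printf 'I\00112345\001Abc')

# cat -v renders control-A as ^A.
visible=$(printf '%s\n' "$sample" | cat -v)

# Swap the delimiter for a printable character to read the record easily.
readable=$(printf '%s\n' "$sample" | tr '\001' '|')

# Extract the second field using control-A as the delimiter.
second=$(printf '%s\n' "$sample" | cut -d "$ctrl_a" -f 2)

printf '%s\n%s\n%s\n' "$visible" "$readable" "$second"
# I^A12345^AAbc
# I|12345|Abc
# 12345
```

With the endpoint from the answer, rows would additionally be separated by ^D (octal 004) instead of a newline, so a real file may appear as one long line until that character is translated as well.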