AWS Glue ETL: converting SQL data to a JSON array

Time: 2019-05-22 09:17:13

Tags: amazon-web-services etl aws-glue

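I am trying to use the AWS Glue service to read data from an RDS instance and store it as a JSON file in an S3 bucket, so that the data can be used elsewhere.

I was able to set up the connection to the database, and the crawler was able to populate the tables.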

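from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.transforms import Join

glueContext = GlueContext(SparkContext.getOrCreate())

tags = glueContext.create_dynamic_frame.from_catalog(database = "search-rds", table_name = "glue_search_db_tag")
instances = glueContext.create_dynamic_frame.from_catalog(database = "search-rds", table_name = "glue_search_db_instance")
classes = glueContext.create_dynamic_frame.from_catalog(database = "search-rds", table_name = "glue_search_db_class")

# Join tags to instances (parent = noderef), then join classes to that result (noderef = class)
tagdetails = Join.apply(classes, Join.apply(tags, instances, 'parent', 'noderef'), 'noderef', 'class')

# destinationbucket holds the S3 path and is defined elsewhere in the job
glueContext.write_dynamic_frame.from_options(frame = tagdetails, connection_type = "s3", connection_options = {"path": destinationbucket}, format = "json")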

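With the code above I can write the data to the S3 bucket, but the file format is not what I want. Each record is written as a separate JSON object on its own line: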

{"full_name":"TAG1_FULLNAME","parent":12340,"context":0,"noderef":12340}
{"full_name":"TAG2_FULLNAME","parent":12340,"context":0,"noderef":12341}
{"full_name":"TAG3_FULLNAME","parent":12340,"context":0,"noderef":12342}
{"full_name":"TAG4_FULLNAME","parent":12340,"context":0,"noderef":12343}

What I want is a single JSON array, like this:

[{"full_name":"TAG1_FULLNAME","parent":12340,"context":0,"noderef":12340},
{"full_name":"TAG2_FULLNAME","parent":12340,"context":0,"noderef":12341},
{"full_name":"TAG3_FULLNAME","parent":12340,"context":0,"noderef":12342},
{"full_name":"TAG4_FULLNAME","parent":12340,"context":0,"noderef":12343}]

Is there any way I can do this? Any suggestions would be much appreciated. Please also let me know if I am missing something here.
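To make the difference between the two formats concrete, here is a minimal sketch in plain Python that turns line-delimited JSON text into a single JSON array. The function name json_lines_to_array and the two-record sample are hypothetical and only for illustration. It assumes the whole result fits in memory, and it is not a built-in Glue option.

```python
import json


def json_lines_to_array(lines_text):
    """Parse line-delimited JSON (one object per line) into one JSON array string."""
    # Skip blank lines and parse each remaining line as its own JSON document.
    records = [json.loads(line) for line in lines_text.splitlines() if line.strip()]
    # Compact separators keep the output close to what the writer emitted per line.
    return json.dumps(records, separators=(",", ":"))


# Two sample records in the shape shown in the question.
sample = "\n".join([
    '{"full_name":"TAG1_FULLNAME","parent":12340,"context":0,"noderef":12340}',
    '{"full_name":"TAG2_FULLNAME","parent":12340,"context":0,"noderef":12341}',
])

array_text = json_lines_to_array(sample)
print(array_text)
```

Inside a Glue job, the input lines could presumably come from tagdetails.toDF().toJSON().collect(), and the resulting string could be uploaded with boto3. Both steps are assumptions and are only reasonable when the joined result is small enough to collect on the driver.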

0 answers:

No answers