I created external schema for my database in aws glue. I can see the list of table but I cannot look into the json data. redshift throws me this errors.
[Amazon](500310) Invalid operation: S3 Query Exception (Fetch)
Details:
-----------------------------------------------
error: S3 Query Exception (Fetch)
code: 15001
context: Task failed due to an internal error. Error occured during Ion/JSON extractor match: IERR_INVALID_SYNTAX
query: 250284
location: dory_util.cpp:717
process: query2_124_250284 [pid=12336]
-----------------------------------------------;
1 statement failed.
I dont want to create external tables because I will create a view combining the external tables in the data catalog in aws glue.
Just an update:
I used aws glue crawler in creating the tables in the data catalog. They are in json format. If I use a job that will upload this data in redshift they are loaded as flat file (except arrays) in redshift table.
Example of json data:
{
"array": [
1,
2,
3
],
"boolean": true,
"null": null,
"number": 123,
"object": {
"a": "b",
"c": "d",
"e": "f"
},
"string": "Hello World"
}
If I upload them using a job in aws glue the output will be like (as table)
Now, I have trmendous amount of tables crawled in data catalog. I am struggling creating the individual script of this tables that is why an amazon redshift spectrum external schema can be helpful.
However when I query the external table in the external schema I am getting the error as posted above. I do not encounter problems with external tables from the data catalog if they are loaded as csv but the format files I need to read in redshift spectrum should be in json.
Is it possible to view the external table in redshift spectrum the same format when it is loaded using a job?