I have the following AWS Athena CREATE TABLE statement:
CREATE EXTERNAL TABLE IF NOT EXISTS s2cs3dataset.s2c_storage (
`MessageHeader` string,
`TimeToProcess` float,
`KeyCreated` string,
`KeyLastTouch` string,
`CreatedDateTime` string,
`TableReference` array<struct<`BusinessObject`: string,
`TransactionType`: string,
`ReferenceKeyId`: float,
`ReferencePrimaryKey`: string,
`IncludedTables`: array<string>>>,
`SAPStoreReference` string
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES (
'serialization.format' = '1'
)
LOCATION 's3://api-dev-dpstorage-s3/S2C_INPUT/storage/'
TBLPROPERTIES ('has_encrypted_data'='false');
From this table, I want to select the following items with this query:
SELECT MessageHeader,
TimeToProcess,
KeyCreated,
KeyLastTouch,
CreatedDateTime,
tr.BusinessObject,
tr.TransactionType,
tr.ReferencePrimaryKey,
it.IncludedTables,
SAPStoreReference
FROM s2c_storage
cross join UNNEST(s2c_storage.tablereference) as p(tr)
cross join UNNEST(tr.IncludedTables) as p(it)
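But I get the following error:
SYNTAX_ERROR: line 9:1: Expression "it" is not of type ROW
If I remove the last cross join and the column that references it, the query works fine, so I am doing something wrong when trying to unnest the array of strings inside the array of structs in the JSON data. Any tips?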
Answer 0 (score: 0)
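Per the clarifying comments, tr.IncludedTables has type array(varchar).
So in the query ... CROSS JOIN UNNEST(tr.IncludedTables) AS p(it), the column it has type varchar. In the SELECT clause you can refer to this value as it (or give it an alias: it AS IncludedTables), but you cannot refer to it as it.IncludedTables. The dot syntax only works on a ROW value, and it is a plain varchar. A varchar has no fields, so in particular it has no IncludedTables field, which is what the error message is reporting.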
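
Putting this together, a minimal sketch of the corrected query might look like the following, assuming the table definition from the question. The only substantive change is selecting it directly (aliased as IncludedTables). The distinct relation aliases t and i are an illustrative choice to keep the two UNNEST relations apart, not something the answer requires.

```sql
SELECT MessageHeader,
       TimeToProcess,
       KeyCreated,
       KeyLastTouch,
       CreatedDateTime,
       tr.BusinessObject,
       tr.TransactionType,
       tr.ReferencePrimaryKey,
       it AS IncludedTables,  -- it is a varchar, so select it directly
       SAPStoreReference
FROM s2c_storage
-- tr: one struct (ROW) per element of the tablereference array
CROSS JOIN UNNEST(s2c_storage.tablereference) AS t(tr)
-- it: one varchar per element of that struct's IncludedTables array
CROSS JOIN UNNEST(tr.IncludedTables) AS i(it)
```

Note that CROSS JOIN UNNEST produces no output row for a struct whose IncludedTables array is empty or null, so such records would drop out of this result.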