Exploding an array in Athena

Date: 2018-09-25 13:29:47

Tags: amazon-dynamodb amazon-athena presto

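I have a simple table in Athena that contains an array of events. I want to write a simple SELECT statement so that each event in the array becomes its own row.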

I tried explode and transform, but had no luck. I have done this successfully in Spark and Hive, but Athena is tripping me up. Please advise.

DROP TABLE bi_data_lake.royalty_v4;
CREATE external TABLE bi_data_lake.royalty_v4 (
   KAFKA_ID string,
   KAFKA_TS string,
   deviceUser struct< deviceName:string, devicePlatform:string >,
   consumeReportingEvents array<
                                struct<
                                        consumeEvent: string,
                                        consumeEventAction: string,
                                        entryDateTime: string
                                      >
                               >
   )
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://XXXXXXXXXXX';

Query that does not work:

select kafka_id, kafka_ts,deviceuser, 
transform( consumereportingevents, consumereportingevent -> consumereportingevent.consumeevent) as cre
from bi_data_lake.royalty_v4 
where kafka_id = 'events-consumption-0-490565';
  

Not supported: lateral view explode(consumereportingevents) as consumereportingevent

Answering my own question: use UNNEST

I found the answer to my question:

WITH samples AS (
 select kafka_id, kafka_ts,deviceuser, consumereportingevent, consumereportingeventPos
 from bi_data_lake.royalty_v4 
 cross join unnest(consumereportingevents)  WITH ORDINALITY AS T (consumereportingevent, consumereportingeventPos)
 where kafka_id = 'events-consumption-0-490565' or kafka_id = 'events-consumption-0-490566'
)
SELECT * FROM samples
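
To make the difference between the two attempts concrete, here is a minimal sketch on a made-up inline dataset (the `dataset` CTE, the `events` column, and its values are hypothetical and not taken from the real table). `transform` applies a lambda to each element and returns an array, so the row count stays the same; `CROSS JOIN UNNEST` is what produces one row per element. Athena runs one statement at a time, so the two queries would be submitted separately.

```sql
-- Query 1: transform keeps a single row; cre is still an array.
WITH dataset AS (
  -- Hypothetical sample data standing in for the real table.
  SELECT 'k1' AS kafka_id, ARRAY['play', 'pause', 'stop'] AS events
)
SELECT kafka_id, transform(events, e -> upper(e)) AS cre
FROM dataset;

-- Query 2: UNNEST emits one row per array element.
WITH dataset AS (
  SELECT 'k1' AS kafka_id, ARRAY['play', 'pause', 'stop'] AS events
)
SELECT kafka_id, event
FROM dataset
CROSS JOIN UNNEST(events) AS t (event);
```

With this sample data, the first query returns one row and the second returns three.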

1 answer:

Answer 0 (score: -1)

In AWS Athena, use UNNEST to flatten ("explode") nested arrays.

WITH dataset AS (
  SELECT
    'engineering' as department,
    ARRAY['Sharon', 'John', 'Bob', 'Sally'] as users
)
SELECT department, names FROM dataset
CROSS JOIN UNNEST(users) as t(names)
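
If the position of each element within the array also matters, the same toy dataset can be unnested with WITH ORDINALITY, as a sketch. The column aliases `name` and `pos` are illustrative choices; the ordinality column counts from 1.

```sql
WITH dataset AS (
  SELECT
    'engineering' AS department,
    ARRAY['Sharon', 'John', 'Bob', 'Sally'] AS users
)
-- pos is the 1-based index of each name in the original array.
SELECT department, name, pos
FROM dataset
CROSS JOIN UNNEST(users) WITH ORDINALITY AS t (name, pos)
ORDER BY pos;
```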

Reference: Flattening Nested Arrays