Athena / Presto data discovery query to recommend a JSON schema?

Time: 2018-11-03 03:09:59

Tags: amazon-athena presto

I have an Athena table (raw) with a single column (json).

I have the following query, which outputs the frequency of each JSON key:

SELECT key, count(*)
FROM (
  SELECT map_keys(cast(json_parse(json) AS map(varchar, json))) AS keys
  FROM raw
)
CROSS JOIN UNNEST (keys) AS t (key)
GROUP BY key

How can I extend this query so that it tells me whether a given key has any values containing non-numeric characters?

[I removed my failed attempts after finding the answer]

1 answer:

Answer 0 (score: 0)

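This works. It casts each JSON document to a map(varchar, varchar), unnests the map into key/value pairs, and counts per key how many values can be cast to double: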

WITH dataset AS (
    SELECT cast(json_parse(json) AS map(varchar, varchar)) AS kv
    FROM raw
)
SELECT k,
       count(*) AS isPresent,
       sum(isNumber) AS isNumber,
       count(*) - sum(isNumber) AS notIsNumber
FROM (
    -- TRY(...) returns NULL when the value cannot be cast to double
    SELECT t.k, IF(TRY(cast(t.v AS double)) IS NULL, 0, 1) AS isNumber
    FROM dataset
    CROSS JOIN UNNEST(kv) AS t (k, v)
)
GROUP BY k
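
Note that the cast to double checks "parses as a number", which is slightly looser than "contains only numeric characters": a value such as '1e5' casts successfully. If the stricter reading of the question is what you need, a regular expression may be a closer fit. The following is only a sketch under assumptions: the inline sample rows stand in for the real raw table, the names kv_pairs, digits_only and has_non_digit are made up for illustration, and the pattern treats only unsigned integers as "numeric", so adjust it if signs or decimal points should be allowed.

```sql
-- Hypothetical sample rows standing in for the real `raw` table.
WITH raw (json) AS (
    VALUES
        ('{"id": "42", "name": "alice"}'),
        ('{"id": "43", "name": "bob"}'),
        ('{"id": "x44", "zip": "90210"}')
),
-- One row per (key, value) pair, with every value rendered as varchar.
kv_pairs AS (
    SELECT t.k, t.v
    FROM raw
    CROSS JOIN UNNEST(cast(json_parse(json) AS map(varchar, varchar))) AS t (k, v)
)
SELECT k,
       count(*) AS is_present,
       -- values made up of digits only
       count_if(regexp_like(v, '^[0-9]+$')) AS digits_only,
       -- values with at least one non-digit character (or empty strings)
       count_if(NOT regexp_like(v, '^[0-9]+$')) AS has_non_digit
FROM kv_pairs
GROUP BY k
ORDER BY k
```

A key whose has_non_digit count is above zero has at least one value that is not purely digits. JSON null values yield a NULL from regexp_like, so they are counted in is_present but in neither of the other two columns.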