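I am using JSON as an array in my file https://www.sitepoint.com/google-maps-json-file/. I used a JSON SerDe to load the data into a table because it has an array-like structure; we can't use the `json_tuple` and `get_json_object` UDFs with arrays, otherwise they return NULL.
Can't we use a where clause in an HQL query on JSON data? Whenever I query the table it returns the complete JSON data; it does not filter.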
`hive> select * from complex_json where markers[1].point="4578"
OK
[{"point":"1233","hometeam":"Lawrence Library","awayteam":"LUGip","markerimage":"images/red.png","information":"Linux users group meets second Wednesday of each month.","fixture":"Wednesday 7pm","capacity":"","previousscore":""},
{"point":"4578","hometeam":"Hamilton Library","awayteam":"LUGip HW SIG","markerimage":"images/white.png","information":"Linux users can meet the first Tuesday of the month to work out harward and configuration issues.","fixture":"Tuesday 7pm","capacity":"","previousscore":null}]
Time taken: 0.304 seconds, Fetched: 1 row(s)`
Answer 0 (score: 1)
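Each record in the table is an array, so when the `where` clause finds a match, `select *` returns the entire record as output.
Since your `where` clause applies to `markers[1]`, you can use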
select markers[1] from complex_json where markers[1].point="4578"
This will fetch only the required JSON element from the array.
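If the position of the matching element is not known in advance, one option is to flatten the array with Hive's `LATERAL VIEW explode(...)` and filter on each element. The sketch below assumes `markers` is declared as an array of structs with string fields such as `point`, `hometeam`, and `fixture`; the aliases `exploded` and `m` are hypothetical names chosen for illustration.

```sql
-- Assumption: markers is array<struct<point:string, hometeam:string, fixture:string, ...>>
-- explode() emits one row per array element, so the filter applies per marker
-- instead of per record. "exploded" and "m" are illustrative aliases.
SELECT m.point, m.hometeam, m.fixture
FROM complex_json
LATERAL VIEW explode(markers) exploded AS m
WHERE m.point = '4578';
```

With the sample data in the question, this should return only the marker whose `point` is 4578, regardless of its index in the array.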
Answer 1 (score: 0)
If you do not specify columns in the select query, Hive returns the raw JSON, as Franklin just explained.