Hive: how to get an element from an array when the element itself is also an array

Time: 2016-09-29 23:11:27

Tags: json oracle hadoop hive

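I have a database table with a column that stores a JSON-formatted string. The string is an array of elements, and each element contains several key-value pairs. Some values may themselves contain multiple key-value pairs, for example the "address" attribute shown below.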

People table:
  Col1      Col2   .....   info
  aaa       bbb           see below

The "info" column contains a JSON-formatted string like the following:

 [{"name":"abc", 
  "address":{"street":"str1", "city":"c1"},
  "phone":"1234567"
 },
 {"name":"def", 
  "address":{"street":"str2", "city":"c1", "county":"ct"},
  "phone":"7145895"
 }
]

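I need to extract the individual value of each field in the JSON string. I can do this for every field except "address" by splitting the array into one string per element and calling explode() on the result, as follows: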

 SELECT  
   get_json_object(person, '$.name') AS name,
   get_json_object(person, '$.phone') AS phone,
   get_json_object(person, '$.address') AS addr
 FROM people lateral view explode(split(regexp_replace(
      regexp_replace(info, '\\}\\,\\{', '\\}\\\\n\\{' ), '\\[|\\]',''), '\\\\n')) 
      p as person;

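My question is how to get at the fields inside "address". The "address" field can contain any number of key-value pairs, and I cannot use JSONSerDe. I tried adding another explode() call, but I could not get it to work. Can someone please help? Thank you very much.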

1 answer:

Answer 0: (score: 1)

You can call get_json_object directly with a nested path such as '$.address.street':

SELECT  
  get_json_object(person, '$.name') AS name,
  get_json_object(person, '$.phone') AS phone,
  get_json_object(person, '$.address.street') AS street,
  get_json_object(person, '$.address.city') AS city,
  get_json_object(person, '$.address.county') AS county
FROM people lateral view explode(split(regexp_replace(
  regexp_replace(info, '\\}\\,\\{', '\\}\\\\n\\{' ), '\\[|\\]',''), '\\\\n')) 
  p as person;
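
The query above names every address key explicitly, so it only covers keys known in advance. Since the question says "address" can hold any number of key-value pairs, one possible workaround is to turn the address object into a map and explode that as well. The following is only a sketch under stated assumptions: the address values are flat strings containing no commas, colons, braces, or double quotes, and the aliases kv, addr_key, and addr_value are made up for illustration. It relies on the built-in str_to_map function and on explode() accepting a map.

```sql
-- Sketch only: assumes address values are flat strings with no ',' ':' '{' '}' or '"' inside them.
-- kv, addr_key and addr_value are illustrative alias names, not from the original post.
SELECT
  get_json_object(person, '$.name')  AS name,
  get_json_object(person, '$.phone') AS phone,
  kv.addr_key,
  kv.addr_value
FROM people
-- First lateral view: same array-splitting trick as in the question, one row per person.
LATERAL VIEW explode(split(regexp_replace(
  regexp_replace(info, '\\}\\,\\{', '\\}\\\\n\\{' ), '\\[|\\]',''), '\\\\n'))
  p AS person
-- Second lateral view: strip braces and quotes from the address JSON,
-- turning {"street":"str1","city":"c1"} into street:str1,city:c1,
-- then parse that into a map and emit one row per key-value pair.
LATERAL VIEW explode(str_to_map(
  regexp_replace(get_json_object(person, '$.address'), '[{}"]', ''), ',', ':'))
  kv AS addr_key, addr_value;
```

This returns one row per person and address key rather than one column per key, so a person with three address keys produces three rows. A person with no "address" at all would likely be dropped from the result; using LATERAL VIEW OUTER for the second lateral view should keep such rows with NULL key and value.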