Problem parsing JSON data on Hive

Time: 2016-02-26 11:49:51

Tags: apache-pig

I want to parse JSON data in Pig. Here is my valid JSON data:

[{"author":"gjkfhvk","title":"gdfjhsdgfjk","published":1997, "reviews":[{"name":"fvdjk","stars":5},{"name":"dhjk","stars":4}]}, {"author":"ggkjhk","title":"gdfghfjhgh","published":1998,"reviews":[{"name":"jhjk","stars":6},{"name":"fghh","stars":6}]}]

Here is my Pig command:

data = load '/home/user/Desktop/tej/pig.json' using JsonLoader('author:chararray,title:chararray,year:int,reviews:{review:(name:chararray,stars:int)}');

When I display the contents of data with this command: dump data;

I get this output:

Input(s):
Successfully read 3 records from: "/home/user/Desktop/pig.json"

Output(s):
Successfully stored 3 records in: "file:/tmp/temp1826337556/tmp244945211"

(,)
(,)
(gdfghfjhgh,{(jhjk),(fghh)})

I can't get the title and reviews data for the first element of the array.

Can you help me?

1 answer:

Answer 0 (score: 0)

The JSON is invalid for this loader. I have corrected it. See below.

Input

{"author":"gjkfhvk","title":"gdfjhsdgfjk","published":1997,"reviews":[{"name":"fvdjk","stars":5},{"name":"dhjk","stars":4}]}
{"author":"ggkjhk","title":"gdfghfjhgh","published":1998,"reviews":[{"name":"jhjk","stars":6},{"name":"fghh","stars":6}]}
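
Compared with the file in the question, the corrected input drops the enclosing array brackets and the commas between records. For a larger file this could be done with a small script instead of by hand. The following is only a minimal sketch in Python, assuming the loader wants one JSON object per line; the function name array_to_json_lines and the file names in the usage comment are made up for illustration and are not part of Pig.

```python
import json
import sys


def array_to_json_lines(text):
    """Turn a JSON array of objects into one compact JSON object per line."""
    records = json.loads(text)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")
    # Compact separators keep each record on a single line with no extra spaces.
    lines = [json.dumps(record, separators=(",", ":")) for record in records]
    return "\n".join(lines) + "\n"


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python to_json_lines.py pig.json > pig_lines.json
    with open(sys.argv[1], encoding="utf-8") as f:
        sys.stdout.write(array_to_json_lines(f.read()))
```

The converted file can then be passed to the same load statement in place of the original one.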

Output