I'm new to Pig, and I'm trying to parse JSON with the following structure:
{"id1":197,"id2":[
{"id3":"109.11.11.0","id4":"","id5":1391233948301},
{"id3":"10.10.15.81","id4":"","id5":1313393100648},
...
]}
The file above is named jsonfile.txt. I load it like this:
alias = load 'jsonfile.txt' using JsonLoader('id1:int,id2:[id3:chararray,id4:chararray,id5:chararray]');
This is the error I get:
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: mismatched input 'id3' expecting RIGHT_BRACKET
Any idea what I might be doing wrong?
Answer 0: (score: 1)
The schema you pass to JsonLoader is malformed.
Complex data types are written as follows:
Tuple: enclosed by (), items separated by ","
    Non-empty tuple: (item1,item2,item3)
    Empty tuple is valid: ()
Bag: enclosed by {}, tuples separated by ","
    Non-empty bag: {(tuple1),(tuple2),(tuple3)}
    Empty bag is valid: {}
Map: enclosed by [], items separated by ",", key and value separated by "#"
    Non-empty map: [key1#value1,key2#value2]
    Empty map is valid: []
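Source: http://pig.apache.org/docs/r0.10.0/func.html#jsonloadstore
In other words, [] does not denote an array in Pig. It denotes a map (an associative table), with "#" separating each key from its value. Try using a tuple (parentheses) instead: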
'id1:int,id2:(id3:chararray,id4:chararray,id5:chararray)'
OR
'id1:int,id2:{(id3:chararray,id4:chararray,id5:chararray)}'
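I can't test this and have never used Pig, but according to the documentation it should work.
(Based on the following example from the documentation:)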
a = load 'a.json' using JsonLoader('a0:int,a1:{(a10:int,a11:chararray)},a2:(a20:double,a21:bytearray),a3:[chararray]');
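
Applied to the file from the question, the bag form might look like the sketch below. It is untested, like the rest of this answer. It assumes that jsonfile.txt holds one complete JSON record per line, which to my understanding is what JsonLoader expects, and the alias names records and entries are made up for illustration. Because id2 is a JSON array of objects, the bag-of-tuples schema is probably the closer match of the two options above.

```pig
-- Sketch only: untested, and assumes one complete JSON record per line in jsonfile.txt.
-- id2 is declared as a bag of tuples, written {( ... )}, instead of a map, written [ ... ].
records = LOAD 'jsonfile.txt'
    USING JsonLoader('id1:int,id2:{(id3:chararray,id4:chararray,id5:chararray)}');

-- Show the schema Pig derived from the JsonLoader argument.
DESCRIBE records;

-- Unnest the bag so that each id2 entry becomes its own row next to id1.
entries = FOREACH records GENERATE id1, FLATTEN(id2);
DUMP entries;
```

If the load still fails, running DESCRIBE on its own is a quick way to check whether the schema string itself parses before any data is read.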