Avro架构如下所示:
{
"type" : "record",
"name" : "name1",
"fields" :
[
{
"name" : "f1",
"type" : "string"
},
{
"name" : "f2",
"type" :
{
"type" : "array",
"items" :
{
"type" : "record",
"name" : "name2",
"fields" :
[
{
"name" : "time",
"type" : [ "float", "int", "double", "long" ]
},
]
}
}
}
]
}
在Pig阅读之后:
grunt> A = load 'data' using AvroStorage();
grunt> DESCRIBE A;
A: {f1: chararray,f2: {ARRAY_ELEM: (time: (FLOAT: float,INT: int,DOUBLE: double,LONG: long))}}
我想要的可能是一包(f1:chararray, timestamp:double)
。这就是我所做的:
grunt> B = FOREACH A GENERATE f1, f2.time AS timestamp;
grunt> DESCRIBE B;
B: {f1: chararray,timestamp: {(time: (FLOAT: float,INT: int,DOUBLE: double,LONG: long))}}
那么我该如何压扁这个记录?
我是Pig,Avro的新手,并且不知道我想要做什么甚至是有道理的。谢谢你的帮助。