Apache Pig & piggybank: Avro union types

Time: 2014-03-24 22:52:45

Tags: apache-pig avro

I have a record with a union-type field:
union {TypeA, TypeB, TypeC, TypeD, TypeE} mydata;

I have serialized data in Avro format, but when I try to load it with the AvroStorage function from piggybank.jar, I get the following error:

Caused by: java.io.IOException: We don't accept schema containing generic unions.
    at org.apache.pig.piggybank.storage.avro.AvroSchema2Pig.convert(AvroSchema2Pig.java:54)
    at org.apache.pig.piggybank.storage.avro.AvroStorage.getSchema(AvroStorage.java:384)
    at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:174)
    ... 23 more

So I read the piggybank source code at https://github.com/triplel/pig/blob/branch-0.12/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java and found this:

    /** determine whether a union is a nullable union;
     * note that this function doesn't check containing
     * types of the input union recursively. */
    public static boolean isAcceptableUnion(Schema in) {
        if (! in.getType().equals(Schema.Type.UNION))
            return false;

        List<Schema> types = in.getTypes();
        if (types.size() <= 1) {
            return true;
        } else if (types.size() > 2) {
            return false; /* contains more than 2 types */
        } else {
            /* one of two types is NULL */
            return types.get(0).getType().equals(Schema.Type.NULL)
                    || types.get(1).getType().equals(Schema.Type.NULL);
        }
    }
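To make the rule concrete, here is a minimal standalone sketch of the same size and null check. It is not the piggybank code: `UnionCheckSketch` and its `Type` enum are hypothetical stand-ins for `org.apache.avro.Schema.Type`, so that it runs without the Avro jar, and it assumes the input is already known to be a union.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class UnionCheckSketch {
    /** Hypothetical stand-in for org.apache.avro.Schema.Type (only a few values). */
    public enum Type { NULL, STRING, INT, RECORD }

    /**
     * Mirrors the branch-count rule of isAcceptableUnion quoted above,
     * applied to the list of branch types of a union.
     */
    public static boolean isAcceptableUnion(List<Type> branches) {
        if (branches.size() <= 1) {
            return true;                 // empty or single-branch union
        } else if (branches.size() > 2) {
            return false;                // more than two branches is rejected
        } else {
            // exactly two branches: one of them must be NULL
            return branches.get(0) == Type.NULL || branches.get(1) == Type.NULL;
        }
    }

    public static void main(String[] args) {
        // ["null", "string"]: a nullable union, accepted
        System.out.println(isAcceptableUnion(Arrays.asList(Type.NULL, Type.STRING)));
        // ["string", "int"]: two branches, neither is null, rejected
        System.out.println(isAcceptableUnion(Arrays.asList(Type.STRING, Type.INT)));
        // five record branches, like {TypeA, ..., TypeE}: rejected
        System.out.println(isAcceptableUnion(Collections.nCopies(5, Type.RECORD)));
    }
}
```

Under these assumptions the three calls print `true`, `false`, `false`, which matches what I see: my five-branch union fails the check.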

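So basically Pig's AvroStorage does not support unions with more than two types (and, per the code above, a two-type union is only accepted when one of its branches is null). What was the reasoning behind this decision? Why not make it fully compatible with Avro?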

0 answers:

No answers