Automatic class registration in Kryo

Time: 2012-09-05 08:15:55

Tags: java kryo

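As I understand it, Kryo builds a className <-> numberID map on every writeObject. That map is too narrow in scope. Because the instances in an object model tend to belong to the same classes, the next writeObject builds and serializes a similar map again (and again, and again). I know the map can be shared by registering classes manually, but that is tedious manual hardcoding. I would like the map to be started by the first object write, as it normally is, but have all subsequent writes in the session reuse and extend it. That way registration would happen automatically at runtime, with no extra runtime overhead, and the classes encountered first would naturally receive low id numbers. The map could afterwards be stored separately, as an attachment that acts as a decoding key, and the deserializer would start by loading it. What do you think of the idea, and how could it be implemented?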
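To make the idea concrete, here is a minimal sketch in plain Java, independent of Kryo. `SessionClassIds` and its methods are hypothetical names invented for this illustration. It only shows ids being handed out on first sight, persisted separately from the data, and extended after reloading.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical session-wide registry: ids are assigned on first sight and never reset. */
public class SessionClassIds {
    private final Map<String, Integer> idByName = new LinkedHashMap<>();

    /** Returns the id of a class, assigning the next free id if the class is new. */
    public int idFor(Class<?> type) {
        String name = type.getName();
        Integer id = idByName.get(name);
        if (id == null) {
            id = idByName.size();
            idByName.put(name, id);
        }
        return id;
    }

    /** Serializes the map on its own, separately from any object data. */
    public byte[] save() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(idByName.size());
            for (Map.Entry<String, Integer> e : idByName.entrySet()) {
                out.writeInt(e.getValue());
                out.writeUTF(e.getKey());
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Rebuilds the registry; a reader would call this before reading any object. */
    public static SessionClassIds load(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            SessionClassIds registry = new SessionClassIds();
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                int id = in.readInt();
                registry.idByName.put(in.readUTF(), id);
            }
            return registry;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        SessionClassIds ids = new SessionClassIds();
        System.out.println(ids.idFor(String.class));  // 0
        System.out.println(ids.idFor(Integer.class)); // 1
        System.out.println(ids.idFor(String.class));  // 0 again, reused

        SessionClassIds reloaded = load(ids.save());
        System.out.println(reloaded.idFor(Long.class)); // 2, the map is extended
    }
}
```

Ids here follow first-seen order rather than true frequency; that matches the proposal only to the extent that common classes tend to show up early.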

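My question is similar to "Strategy for registering classes with kryo", where the user could combine all writes under a single writeObject by using a List. As I said, that is simpler than storing the map separately. However, it seems he does not want to do that. In my case such a combination is not even possible: I have a large Java model, and I use serialization precisely to avoid keeping all of it in memory. In my scenario, a user opens a project, makes changes and flushes them. The project could therefore maintain the class map and use it for all serializations.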

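Update! I realized that there are class/object registries and autoReset. They seem to have been created for exactly this task. However, I do not see how they solve it. autoReset=false does make the second write smaller, but in that case I cannot deserialize the objects. As you can see in the example, the second deserialization fails: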

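import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;
import org.objenesis.strategy.StdInstantiatorStrategy;

public class A {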
    String f1;
    A(String a) {
        f1 = a;
    }
    List list = new ArrayList();
    public String toString() {
        return "f1 = " + f1 + ":" + f1.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        test(true);
        test(false);
    }


    static void write(String time, Kryo kryo, ByteArrayOutputStream baos, Object o) {
        Output output = new Output(baos); 
        kryo.writeClassAndObject(output, o); 
        output.close();
        System.err.println(baos.size() + " after " + time + " write");
    }

    private static void test(boolean autoReset) {
        Kryo kryo = new Kryo();
        kryo.setAutoReset(autoReset);
        kryo.setInstantiatorStrategy(new StdInstantiatorStrategy());
        System.err.println("-------\ntesting autoreset = " + autoReset);
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        A a = new A("a"), b = new A("b");
        write("first", kryo, baos, a);
        write("second", kryo, baos, b);
        A o1 = restore("first", baos, kryo);
        A o2 = restore("second", baos, kryo); // this fails
        System.err.println((o1.f1.equals(o2.f1)) ? "SUCCESS" : "FAILURE");

    }

    private static A restore(String time, ByteArrayOutputStream baos, Kryo k) {
        // A fresh stream over the whole buffer: every call starts reading at offset 0.
        ByteArrayInputStream in = new ByteArrayInputStream(baos.toByteArray());
        Input input = new Input(in);
        A o = (A) k.readClassAndObject(input);
        System.err.println("reading object " + time + " time, got " + o);
        return o;
    }
}

Output:

-------
testing autoreset = true
41 after first write
82 after second write
reading object first time, got f1 = a:String
reading object second time, got f1 = a:String
SUCCESS
-------
testing autoreset = false
41 after first write
52 after second write
reading object first time, got f1 = a:String
reading object second time, got null
Exception in thread "main" java.lang.NullPointerException
    at kryo_test.AutoresetDemo.test(AutoresetDemo.java:40)
    at kryo_test.AutoresetDemo.main(AutoresetDemo.java:18)
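A possible reading of the autoReset = false result, sketched as a toy model rather than Kryo's actual wire format: if the writer spells a class name out only once and the reader remembers it, both sides have to consume the records in the same order. Note that restore() in the example builds a new stream over the whole buffer on each call, so it always starts at the first record. `NameWriter`, `NameReader` and the record layout below are assumptions made up for this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StatefulStreamDemo {

    /** Toy writer: spells a class name out once, later records carry only its id. */
    public static class NameWriter {
        private final Map<String, Integer> ids = new HashMap<>();

        void write(DataOutputStream out, String className, String payload) throws IOException {
            Integer id = ids.get(className);
            out.writeInt(id == null ? ids.size() : id);
            if (id == null) {
                ids.put(className, ids.size());
                out.writeUTF(className); // the name is present only in the first record
            }
            out.writeUTF(payload);
        }
    }

    /** Toy reader: expects a name only for ids it has not seen yet. */
    public static class NameReader {
        private final Map<Integer, String> names = new HashMap<>();

        String read(DataInputStream in) throws IOException {
            int id = in.readInt();
            String name = names.get(id);
            if (name == null) {
                name = in.readUTF();
                names.put(id, name);
            }
            return name + ":" + in.readUTF();
        }
    }

    /** Writes two records of class "A" with one stateful writer. */
    public static byte[] writeTwo() {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            NameWriter writer = new NameWriter();
            writer.write(out, "A", "a");
            writer.write(out, "A", "b");
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** One stream, one reader: the second read continues where the first stopped. */
    public static List<String> readSequentially(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            NameReader reader = new NameReader();
            List<String> result = new ArrayList<>();
            result.add(reader.read(in));
            result.add(reader.read(in));
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** One stateful reader, but a new stream from offset 0 for every read. */
    public static List<String> rereadFromStart(byte[] data) {
        try {
            NameReader reader = new NameReader();
            List<String> result = new ArrayList<>();
            for (int i = 0; i < 2; i++) {
                DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
                result.add(reader.read(in));
            }
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = writeTwo();
        System.out.println(readSequentially(data)); // [A:a, A:b]
        System.out.println(rereadFromStart(data));  // [A:a, A:A]
    }
}
```

In the second case the reader already knows id 0, skips the name it assumes is absent, and misreads the class name as payload. Whether Kryo fails for exactly this reason is not confirmed here; the sketch only shows that a reader with retained state cannot safely restart from the beginning of the buffer.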

Update 2: Besides class names, autoReset = false may also record object references, so resetting automatically does seem worthwhile.

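Update 3: I found it difficult to serialize the class map (i.e. class -> Registration), because a Registration contains a serializer that references the Kryo object and keeps some state. That makes it hard to share the map among many Kryo objects.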
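One way around this, sketched under assumptions: persist only plain data (id, class name, serializer class name) and let each owner create its own serializer instances from it. `ToySerializer`, `Descriptor` and `LocalRegistry` are hypothetical stand-ins for illustration, not Kryo types.

```java
import java.util.HashMap;
import java.util.Map;

public class PortableRegistrationDemo {

    /** Hypothetical stand-in for a serializer that is tied to one owner and keeps state. */
    public interface ToySerializer {}

    public static class StringToySerializer implements ToySerializer {
        public StringToySerializer() {}
    }

    /** The only thing shared or persisted: an id plus two class names. */
    public static final class Descriptor {
        final int id;
        final String typeName;
        final String serializerName;

        public Descriptor(int id, Class<?> type, Class<? extends ToySerializer> serializer) {
            this.id = id;
            this.typeName = type.getName();
            this.serializerName = serializer.getName();
        }
    }

    /** Per-owner view: builds its own serializer instances from the shared descriptors. */
    public static class LocalRegistry {
        private final Map<Integer, Descriptor> shared;
        private final Map<Integer, ToySerializer> local = new HashMap<>();

        public LocalRegistry(Map<Integer, Descriptor> shared) {
            this.shared = shared;
        }

        public ToySerializer serializerFor(int id) {
            ToySerializer cached = local.get(id);
            if (cached != null) return cached;
            Descriptor d = shared.get(id);
            if (d == null) throw new IllegalArgumentException("unknown id " + id);
            try {
                // Instantiate from the class name, so no serializer object is ever shared.
                ToySerializer created = (ToySerializer) Class.forName(d.serializerName)
                        .getDeclaredConstructor().newInstance();
                local.put(id, created);
                return created;
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        Map<Integer, Descriptor> shared = new HashMap<>();
        shared.put(20, new Descriptor(20, String.class, StringToySerializer.class));
        LocalRegistry first = new LocalRegistry(shared);
        LocalRegistry second = new LocalRegistry(shared);
        System.out.println(first.serializerFor(20) == first.serializerFor(20));  // true
        System.out.println(first.serializerFor(20) == second.serializerFor(20)); // false
    }
}
```

The descriptors contain nothing but an int and two strings, so they are trivial to write to a file, while the stateful serializer objects stay private to each registry.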

1 answer:

Answer 0 (score: 2)

OK, here is a solution for kryo-2.20:

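import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;

import com.esotericsoftware.kryo.ClassResolver;
import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.KryoException;
import com.esotericsoftware.kryo.Registration;
import com.esotericsoftware.kryo.Serializer;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;
import com.esotericsoftware.kryo.util.MapReferenceResolver;
import org.objenesis.strategy.StdInstantiatorStrategy;

import static com.esotericsoftware.kryo.util.Util.getWrapperClass;

// Note: Utils.fill(...) used in log() is not defined in this post; it is presumably the author's own helper.
public class GlobalClassKryo extends Kryo {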

    public static class ExternalizableClassResolver implements ClassResolver {

        //local serializers
        final Map<Class, Registration> fromClass = new HashMap();
        final Map<Integer, Registration> fromId = new HashMap();

        public static class GlobalRegistration { int id; Class type; Class<? extends Serializer> serializer; }

        public final Map<Integer, GlobalRegistration> globalIds; 
        public final Map<Class, GlobalRegistration> globalClasses; 

        // The maps are synchronized because I have one reader thread and one writer thread,
        // and the writer may break the reader when it adds something to the map.
        public ExternalizableClassResolver() {this (
                Collections.synchronizedMap(new HashMap()), 
                Collections.synchronizedMap(new HashMap())
            ) ;}

        public ExternalizableClassResolver(Map<Integer, GlobalRegistration> ids, Map<Class, GlobalRegistration> classes) {
            globalIds = ids;
            globalClasses = classes;
        }

        public ExternalizableClassResolver (DataInput in) throws ClassNotFoundException, IOException {
            this();
            int id;
            while ((id = in.readInt()) != -1) {
                GlobalRegistration e = new GlobalRegistration();
                globalIds.put(e.id = id, e);
                e.type = Class.forName(in.readUTF());
                e.serializer = (Class<? extends Serializer>) Class.forName(in.readUTF());
                globalClasses.put(e.type, e);
            }
        }

        public void save(DataOutput out) throws IOException {
            for (GlobalRegistration entry : globalIds.values()) {
                    out.writeInt(entry.id);
                    out.writeUTF(entry.type.getName());
                    out.writeUTF(entry.serializer.getName());
            }
            out.writeInt(-1);
        }

        static final boolean TRACE = false;
        void log(String msg) {
            System.err.println(kryo != null ? Utils.fill(kryo.getDepth(), ' ')  + msg : msg);
        }
        @Override
        public Registration writeClass(Output output, Class type) {
            if (type == null) {output.writeInt(0, true); return null;}
            Registration registration = kryo.getRegistration(type);
            output.writeInt(registration.getId(), true);
            return registration;
        }
        @Override
        public Registration readClass(Input input) {
            int classID = input.readInt(true);
            if (classID == 0) return null;
            Registration registration = fromId.get(classID);
            if (registration == null) {
                registration = tryGetFromGlobal(globalIds.get(classID), classID + "");
            }
            if (registration == null) throw new KryoException("Encountered unregistered class ID: " + classID);
            return registration;
        }

        public Registration register(Registration registration) {
            throw new KryoException("register(registration) is not allowed. Use register(type, serializer)");
        }

        public Registration getRegistration(int classID) {
            throw new KryoException("getRegistration(id) is not implemented");
        }

        Registration tryGetFromGlobal(GlobalRegistration globalClass, String title) {
            if (globalClass != null) {
                Serializer serializer = kryo.newSerializer(globalClass.serializer, globalClass.type);
                Registration registration = register(globalClass.type, serializer, globalClass.id, "local");
                if (TRACE) log("getRegistration(" + title + ") taken from global => " + registration);
                return registration;
            } else
                if (TRACE) log("getRegistration(" + title + ") was not found");
            return null;
        }
        public Registration getRegistration(Class type) {
            Registration registration = fromClass.get(type);
            if (registration == null) {
                registration = tryGetFromGlobal(globalClasses.get(type), type.getSimpleName());
            } else
                if (TRACE) log("getRegistration(" + type.getSimpleName() + ") => " + registration);

            return registration;
        }

        Registration register(Class type, Serializer serializer, int id, String title) {
            Registration registration = new Registration(type, serializer, id);
            fromClass.put(type, registration);
            fromId.put(id, registration);

            if (TRACE) log("new " + title + " registration, " + registration);

            // why don't we put the wrapper class into fromId as well?
            if (registration.getType().isPrimitive()) fromClass.put(getWrapperClass(registration.getType()), registration);
            return registration;
        }

        int primitiveCounter = 1; // 0 is reserved for NULL
        static final int PRIMITIVE_MAX = 20;

        // Here we register anything that is missing from the global map.
        // A class that is already present must never be registered a second time: this method does not
        // check for that and uses the size of the registered map as the id counter.
        // Normally the check is done in getRegistration, before this method is called.
        public Registration register(Class type, Serializer serializer) {

            if (type.isPrimitive() || type.equals(String.class))
                return register(type, serializer, primitiveCounter++, "primitive");

            GlobalRegistration global = globalClasses.get(type);

            if (global != null )
                    throw new RuntimeException("register(type,serializer): we have " + type + " in the global map, this method must not be called");

            global = new GlobalRegistration();
            globalIds.put(global.id = globalClasses.size() + PRIMITIVE_MAX, global); 
            globalClasses.put(global.type = type, global);
            global.serializer= serializer.getClass();

            return register(global.type, serializer, global.id, "global");
        }

        public Registration registerImplicit(Class type) {
            throw new RuntimeException("registerImplicit is not needed since we register missing classes automatically in getRegistration");
        }

        @Override
        public void reset() {
            // super.reset(); //no need to reset the classes
        }

        Kryo kryo;
        public void setKryo(Kryo kryo) {
            this.kryo = kryo;
        }
    }




    public ExternalizableClassResolver ourClassResolver() {
        return (ExternalizableClassResolver) classResolver;
    }

    public GlobalClassKryo(ClassResolver resolver) {
        super(resolver, new MapReferenceResolver()); 
        setInstantiatorStrategy(new StdInstantiatorStrategy());
        this.setRegistrationRequired(true);
    }
    public GlobalClassKryo() {
        this(new ExternalizableClassResolver());
    }

    @Override
    public Registration getRegistration (Class type) {
        if (type == null) throw new IllegalArgumentException("type cannot be null.");

        if (type == memoizedClass) return memoizedClassValue;
        Registration registration = classResolver.getRegistration(type);
        if (registration == null) {
            if (Proxy.isProxyClass(type)) {
                // If a Proxy class, treat it like an InvocationHandler because the concrete class for a proxy is generated.
                registration = getRegistration(InvocationHandler.class);
            } else if (!type.isEnum() && Enum.class.isAssignableFrom(type)) {
                // This handles an enum value that is an inner class. Eg: enum A {b{}};
                registration = getRegistration(type.getEnclosingClass());
            } else if (EnumSet.class.isAssignableFrom(type)) {
                registration = classResolver.getRegistration(EnumSet.class);
            }
            if (registration == null) {
                //registration = classResolver.registerImplicit(type);
                return register(type, getDefaultSerializer(type));
            }
        }
        memoizedClass = type;
        memoizedClassValue = registration;
        return registration;
    }

    public Registration register(Class type, Serializer serializer) {
        return ourClassResolver().register(type, serializer);}

    public Registration register(Registration registration) {
        throw new RuntimeException("only register(Class, Serializer) is allowed");}

    public Registration register(Class type) {
        throw new RuntimeException("only register(Class, Serializer) is allowed");}

    public Registration register(Class type, int id) {
        throw new RuntimeException("only register(Class, Serializer) is allowed");}

    public Registration register(Class type, Serializer serializer, int id) {
        throw new RuntimeException("only register(Class, Serializer) is allowed");
    }

    static void write(String title, Kryo k, ByteArrayOutputStream baos, Object obj) {
        Output output = new Output(baos);
        k.writeClassAndObject(output, obj);
        output.close();
        System.err.println(baos.size() + " bytes after " + title + " write");
    }
    static class A {
        String field = "abcABC";
        A a = this;
        //int b = 1; // adds 1 byte to serialization
        @Override
        public String toString() {
            return field 
                    + " " + list.size() 
                    //+ ", " + b
                    ;
        }

        // list adds 3 bytes to serialization, two 3-byte string items add additionally 10 bytes in total
        ArrayList list = new ArrayList(100); // capacity is trimmed in serialization
        {
            list.add("LLL");
            list.add("TTT");

        }
    }

    private static void test() throws IOException, ClassNotFoundException {
        GlobalClassKryo k = new GlobalClassKryo();

        ByteArrayOutputStream baos = new ByteArrayOutputStream();

        write("first", k, baos, new A()); // write takes 24 bytes

        //externalize the map
        ByteArrayOutputStream mapOut = new ByteArrayOutputStream();
        DataOutputStream dataOut = new DataOutputStream(mapOut);
        k.ourClassResolver().save(dataOut);
        dataOut.close();

        //deserialize the map
        DataInputStream serialized = new DataInputStream(new ByteArrayInputStream(mapOut.toByteArray()));
        ExternalizableClassResolver resolver2 = new ExternalizableClassResolver(serialized);

        //use the map
        k = new GlobalClassKryo(resolver2);
        write("second", k, baos, new A()); // 24 bytes

        Input input = new Input(new ByteArrayInputStream(baos.toByteArray()));
        Object read = k.readClassAndObject(input);
        System.err.println("output " + read);
    }

    public static void main(String[] args) throws IOException, ClassNotFoundException {
        Kryo k = new Kryo();
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        write("first", k, baos, new A()); // write takes 78 bytes
        write("second", k, baos, new A()); // +78 bytes
        System.err.println("----------------");

        test();
    }
}

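The resulting stream is free of class names. Unfortunately, Kryo turns out to be too slow compared with default Java serialization (2x slower or more), even though it produces a much denser stream. Kryo alone makes my sample serialization almost 10 times smaller, and as you can see, the solution given in this answer contributes an additional factor of 3. In the field, however, where I serialize megabytes, storing to disk with Kryo gives me only a 2x compression relative to Java serialization, together with a 2x slowdown.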
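To see why replacing class names with small ids shrinks the stream, here is a rough sketch that assumes a generic 7-bits-per-byte variable-length integer encoding. The exact byte layout behind writeInt(value, true) is not shown in this post, so treat the numbers as illustrative. `IdVersusNameSize` and its methods are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

public class IdVersusNameSize {

    /** Bytes needed for a non-negative int when each byte carries 7 payload bits. */
    public static int varIntSize(int value) {
        if (value < 0) throw new IllegalArgumentException("value must be non-negative");
        int size = 1;
        while ((value >>>= 7) != 0) {
            size++;
        }
        return size;
    }

    /** Bytes taken by the class name itself, without any length prefix. */
    public static int nameSize(Class<?> type) {
        return type.getName().getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(varIntSize(20));    // 1, e.g. the first id after PRIMITIVE_MAX
        System.out.println(varIntSize(127));   // 1
        System.out.println(varIntSize(128));   // 2
        System.out.println(nameSize(java.util.ArrayList.class)); // 19
    }
}
```

Under that assumption every class reference with an id below 128 costs a single byte, against 19 bytes for a name as short as java.util.ArrayList.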