In a Storm topology

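Time: 2018-05-04 19:25:43

Tags: java scala serialization apache-storm kryo

Storm version: 1.2.1, Java version: 8

I am writing a Storm topology in Scala and started getting the following error when running in cluster mode. I can reproduce the same failure in LocalCluster mode with this config: conf.put(Config.TOPOLOGY_TESTING_ALWAYS_TRY_SERIALIZE, Boolean.box(true)). Here is the trace: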

2018-05-05 00:49:59,342 ERROR util [Thread-37-disruptor-executor[6 6]-send-queue] Async loop died!
java.lang.RuntimeException: java.lang.RuntimeException: java.io.NotSerializableException: com.fasterxml.jackson.databind.node.ObjectNode
    at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:522) ~[storm-core-1.2.1.jar:1.2.1]
    at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:487) ~[storm-core-1.2.1.jar:1.2.1]
    at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:74) ~[storm-core-1.2.1.jar:1.2.1]
    at org.apache.storm.disruptor$consume_loop_STAR_$fn__4492.invoke(disruptor.clj:84) ~[storm-core-1.2.1.jar:1.2.1]
    at org.apache.storm.util$async_loop$fn__557.invoke(util.clj:484) [storm-core-1.2.1.jar:1.2.1]
    at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
Caused by: java.lang.RuntimeException: java.io.NotSerializableException: com.fasterxml.jackson.databind.node.ObjectNode
    at org.apache.storm.serialization.SerializableSerializer.write(SerializableSerializer.java:41) ~[storm-core-1.2.1.jar:1.2.1]
    at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:628) ~[kryo-3.0.3.jar:?]
    at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:100) ~[kryo-3.0.3.jar:?]
    at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:40) ~[kryo-3.0.3.jar:?]
    at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:534) ~[kryo-3.0.3.jar:?]
    at org.apache.storm.serialization.KryoValuesSerializer.serializeInto(KryoValuesSerializer.java:44) ~[storm-core-1.2.1.jar:1.2.1]
    at org.apache.storm.serialization.KryoTupleSerializer.serialize(KryoTupleSerializer.java:44) ~[storm-core-1.2.1.jar:1.2.1]
    at org.apache.storm.daemon.worker$assert_can_serialize.invoke(worker.clj:133) ~[storm-core-1.2.1.jar:1.2.1]
    at org.apache.storm.daemon.worker$mk_transfer_fn$fn__5204.invoke(worker.clj:213) ~[storm-core-1.2.1.jar:1.2.1]
    at org.apache.storm.daemon.executor$start_batch_transfer__GT_worker_handler_BANG_$fn__4882.invoke(executor.clj:314) ~[storm-core-1.2.1.jar:1.2.1]
    at org.apache.storm.disruptor$clojure_handler$reify__4475.onEvent(disruptor.clj:41) ~[storm-core-1.2.1.jar:1.2.1]
    at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:509) ~[storm-core-1.2.1.jar:1.2.1]
    ... 6 more
Caused by: java.io.NotSerializableException: com.fasterxml.jackson.databind.node.ObjectNode
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184) ~[?:1.8.0_131]
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548) ~[?:1.8.0_131]
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509) ~[?:1.8.0_131]
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432) ~[?:1.8.0_131]
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178) ~[?:1.8.0_131]
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548) ~[?:1.8.0_131]
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509) ~[?:1.8.0_131]
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432) ~[?:1.8.0_131]
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178) ~[?:1.8.0_131]
    at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348) ~[?:1.8.0_131]
    at org.apache.storm.serialization.SerializableSerializer.write(SerializableSerializer.java:38) ~[storm-core-1.2.1.jar:1.2.1]
    at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:628) ~[kryo-3.0.3.jar:?]
    at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:100) ~[kryo-3.0.3.jar:?]
    at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:40) ~[kryo-3.0.3.jar:?]
    at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:534) ~[kryo-3.0.3.jar:?]
    at org.apache.storm.serialization.KryoValuesSerializer.serializeInto(KryoValuesSerializer.java:44) ~[storm-core-1.2.1.jar:1.2.1]
    at org.apache.storm.serialization.KryoTupleSerializer.serialize(KryoTupleSerializer.java:44) ~[storm-core-1.2.1.jar:1.2.1]
    at org.apache.storm.daemon.worker$assert_can_serialize.invoke(worker.clj:133) ~[storm-core-1.2.1.jar:1.2.1]
    at org.apache.storm.daemon.worker$mk_transfer_fn$fn__5204.invoke(worker.clj:213) ~[storm-core-1.2.1.jar:1.2.1]
    at org.apache.storm.daemon.executor$start_batch_transfer__GT_worker_handler_BANG_$fn__4882.invoke(executor.clj:314) ~[storm-core-1.2.1.jar:1.2.1]
    at org.apache.storm.disruptor$clojure_handler$reify__4475.onEvent(disruptor.clj:41) ~[storm-core-1.2.1.jar:1.2.1]
    at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:509) ~[storm-core-1.2.1.jar:1.2.1]
    ... 6 more

It seems Storm is trying to serialize an ObjectNode, which it cannot do, so it throws a NotSerializableException.

Shouldn't ObjectNode be serializable? I saw an old discussion about this here, but I feel it should be serializable.

I tried adding the following to the Storm config, but it didn't help.

conf.setSkipMissingKryoRegistrations(false)

I also tried adding conf.registerSerialization(classOf[com.fasterxml.jackson.databind.node.ObjectNode]), but again no luck.

What can be done to fix this?

2 answers:

Answer 0: (score: 0)

ObjectNode is not serializable (it does not implement the Serializable interface).
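One detail in the question's trace may be worth noting: the exception is raised under ObjectOutputStream.defaultWriteFields, which is the step where Java serialization walks the fields of a Serializable object. That suggests the ObjectNode sits inside another object rather than being emitted directly. Below is a minimal JDK-only sketch of that failure mode. Payload and Envelope are hypothetical stand-ins for ObjectNode and for the emitted value; they are not classes from the question.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class NestedSerializationDemo {

    // Hypothetical stand-in for ObjectNode: does not implement Serializable.
    static class Payload {
        final String json;

        Payload(String json) {
            this.json = json;
        }
    }

    // Hypothetical stand-in for the emitted value: Serializable itself,
    // but it holds a field whose type is not.
    static class Envelope implements Serializable {
        private static final long serialVersionUID = 1L;
        final String id;
        final Payload payload;

        Envelope(String id, Payload payload) {
            this.id = id;
            this.payload = payload;
        }
    }

    // Returns null if Java serialization succeeds, otherwise the class name
    // carried by the NotSerializableException.
    static String trySerialize(Object value) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(value);
            return null;
        } catch (NotSerializableException e) {
            return e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A plain String serializes fine, so this prints "null".
        System.out.println(trySerialize("plain string"));
        // The outer object is Serializable, yet the write fails on the nested
        // field and the exception names the nested class (Payload).
        System.out.println(trySerialize(new Envelope("a", new Payload("{}"))));
    }
}
```

If that is what happens in the topology, a Kryo registration for the inner class alone may never be consulted, because the outer object is written entirely by Java serialization. This is an inference from the trace, not something the question confirms.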

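conf.setSkipMissingKryoRegistrations(false) is the default setting. See https://storm.apache.org/releases/2.0.0-SNAPSHOT/Serialization.html, which describes what this property does. I don't think you want to change it in your case.

Adding conf.registerSerialization(ObjectNode.class); to the topology config should work; I'm not sure why it doesn't for you. If you can't get it to work, you can work around it by serializing the value to, for example, a Map or a String before emitting it.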

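Answer 1: (score: 0)

Taking inspiration from @Stig's answer, I now serialize the object to bytes whenever I pass it between bolts, instead of passing my object itself. So now I emit byte arrays from my bolt like this: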

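// getObj appears to be a custom helper (not shown here) that deserializes the byte array.
val messages = input.asInstanceOf[TupleImpl].get("Request").asInstanceOf[Array[Byte]].getObj[List[myObject]]
val objMapper = new ObjectMapper()
// Emit each message as a JSON byte array instead of the object itself.
messages.foreach(message => collector.emit(new Values(objMapper.writeValueAsBytes(message))))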

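Edit 1:

Another possible way to fix this (untried; I solved it by sending bytes) seems to be writing a serializer class for the object you pass from one bolt to another, as described here. Below is the sample serializer from that link: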

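import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.Serializer;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;
import org.apache.avro.Schema;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DatumWriter;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.specific.SpecificDatumReader;
import org.apache.avro.specific.SpecificDatumWriter;
import org.apache.commons.io.IOUtils;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.io.ByteArrayOutputStream;
import java.io.IOException;

// Stock is assumed to be an Avro-generated class; it is not shown in the sample.
public class StockAvroSerializer extends Serializer<Stock> {

    private static final Logger LOG = LoggerFactory.getLogger(StockAvroSerializer.class);
    private final Schema SCHEMA = Stock.getClassSchema();

    // Encodes the record as Avro binary and writes it as [length][bytes].
    public void write(Kryo kryo, Output output, Stock object) {
        DatumWriter<Stock> writer = new SpecificDatumWriter<>(SCHEMA);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        try {
            writer.write(object, encoder);
            encoder.flush();
        } catch (IOException e) {
            LOG.error(e.toString(), e);
        }
        IOUtils.closeQuietly(out);
        byte[] outBytes = out.toByteArray();
        output.writeInt(outBytes.length, true);
        output.write(outBytes);
    }

    public Stock read(Kryo kryo, Input input, Class<Stock> type) {
        // Read the length prefix written by write(), then exactly that many bytes.
        // input.getBuffer() would return Kryo's whole internal buffer, not just this record.
        int length = input.readInt(true);
        byte[] value = input.readBytes(length);
        SpecificDatumReader<Stock> reader = new SpecificDatumReader<>(SCHEMA);
        Stock record = null;
        try {
            record = reader.read(null, DecoderFactory.get().binaryDecoder(value, null));
        } catch (IOException e) {
            LOG.error(e.toString(), e);
        }
        return record;
    }
}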
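The sample's write method puts a length prefix in front of the Avro bytes (output.writeInt(outBytes.length, true)). A reader for that layout would normally consume the prefix and then exactly that many bytes, since several values can share one buffer. Here is a minimal JDK-only sketch of the same framing idea. It uses DataOutputStream and DataInputStream as stand-ins for Kryo's Output and Input, which is an assumption made for illustration; the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class LengthPrefixedFraming {

    // Writes one record as [4-byte length][payload bytes].
    static void writeRecord(DataOutputStream out, byte[] payload) throws IOException {
        out.writeInt(payload.length);
        out.write(payload);
    }

    // Reads exactly one record: the length first, then that many bytes.
    static byte[] readRecord(DataInputStream in) throws IOException {
        int length = in.readInt();
        byte[] payload = new byte[length];
        in.readFully(payload);
        return payload;
    }

    // Writes two records into one buffer, reads them back, and joins them with '|'.
    static String roundTrip(String first, String second) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            writeRecord(out, first.getBytes(StandardCharsets.UTF_8));
            writeRecord(out, second.getBytes(StandardCharsets.UTF_8));
            out.flush();

            DataInputStream in = new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()));
            String a = new String(readRecord(in), StandardCharsets.UTF_8);
            String b = new String(readRecord(in), StandardCharsets.UTF_8);
            return a + "|" + b;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Both records come back intact because each read stops at its own boundary.
        System.out.println(roundTrip("AAPL", "GOOG")); // prints AAPL|GOOG
    }
}
```

With Kryo 3.x, the corresponding read-side calls would likely be input.readInt(true) followed by input.readBytes(length), mirroring the writeInt(length, true) call in the sample.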

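Edit 2:

Here I found why ObjectNode cannot be serialized:

"JsonNode does not know how to serialize itself using the information available at serialization time: there is no ObjectMapper or JsonGenerator to use, and the latter is the component it must have to serialize itself (and its contents, if any). It cannot, and should not, try to instantiate either (how should they be configured?), and static singletons tend to cause problems in larger systems (one part tries to configure them one way, another part differently)."

But this is a fairly old discussion, and I think newer versions should have some mechanism to make it serializable.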