Hi, I am new to Storm and Kafka. I am using Storm 1.0.1 and Kafka 0.10.0. We have a kafkaspout that receives a java bean from a kafka topic. I have spent several hours digging to find the right approach for that. Found a few helpful articles, but none of the approaches has worked for me so far.
Following is my code:

The KafkaProducer:
public class StreamKafkaProducer {

    private static Producer<String, VehicleTrip> producer;
    private final Properties props = new Properties();
    private static final StreamKafkaProducer KAFKA_PRODUCER = new StreamKafkaProducer();

    private StreamKafkaProducer() {
        props.put("bootstrap.servers", "localhost:9092");
        props.put("acks", "all");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "com.abc.serializer.MySerializer");
        producer = new org.apache.kafka.clients.producer.KafkaProducer<>(props);
    }

    public static StreamKafkaProducer getStreamKafkaProducer() {
        return KAFKA_PRODUCER;
    }

    public void produce(String topic, VehicleTrip vehicleTrip) {
        ProducerRecord<String, VehicleTrip> producerRecord = new ProducerRecord<>(topic, vehicleTrip);
        producer.send(producerRecord);
        //producer.close();
    }

    public static void closeProducer() {
        producer.close();
    }
}
I serialize the data onto Kafka using Kryo. The serializer:
public class DataKyroSerializer extends Serializer<Data> implements Serializable {

    @Override
    public void write(Kryo kryo, Output output, Data data) {
        output.writeLong(data.getStartedOn().getTime());
        output.writeLong(data.getEndedOn().getTime());
    }

    @Override
    public Data read(Kryo kryo, Input input, Class<Data> aClass) {
        Data data = new Data();
        data.setStartedOn(new Date(input.readLong()));
        data.setEndedOn(new Date(input.readLong()));
        return data;
    }
}
The custom Scheme (KryoScheme):
public class KryoScheme implements Scheme {

    private ThreadLocal<Kryo> kryos = new ThreadLocal<Kryo>() {
        protected Kryo initialValue() {
            Kryo kryo = new Kryo();
            kryo.addDefaultSerializer(Data.class, new DataKyroSerializer());
            return kryo;
        };
    };

    @Override
    public List<Object> deserialize(ByteBuffer ser) {
        return Utils.tuple(kryos.get().readObject(new ByteBufferInput(ser.array()), Data.class));
    }

    @Override
    public Fields getOutputFields() {
        return new Fields("data");
    }
}
I need to get the data back into the Data bean. According to a few articles, I need to provide a custom Scheme and make it part of the topology, but so far I have had no luck.
The bolt:
public class AnalysisBolt implements IBasicBolt {

    private static final long serialVersionUID = 1L;
    private String topicname = null;

    public AnalysisBolt(String topicname) {
        this.topicname = topicname;
    }

    public void prepare(Map stormConf, TopologyContext topologyContext) {
        System.out.println("prepare");
    }

    public void execute(Tuple input, BasicOutputCollector collector) {
        System.out.println("execute");
        Fields fields = input.getFields();
        try {
            JSONObject eventJson = (JSONObject) JSONSerializer.toJSON(
                    (String) input.getValueByField(fields.get(1)));
            String startTime = (String) eventJson.get("startedOn");
            String endTime = (String) eventJson.get("endedOn");
            String oid = (String) eventJson.get("_id");
            int vId = (Integer) eventJson.get("vehicleId");
            // call method getEventForVehicleWithinTime(Long vehicleId, Date startTime, Date endTime)
            System.out.println("===========" + oid + "| " + vId + "| " + startTime + "| " + endTime);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // no-op stubs so the class satisfies the IBasicBolt interface
    public void cleanup() {
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
    }

    public Map<String, Object> getComponentConfiguration() {
        return null;
    }
}
But when I submit the Storm topology, I get this error:
java.lang.IllegalStateException: Spout 'kafkaspout' contains a
non-serializable field of type com.abc.topology.KryoScheme$1, which
was instantiated prior to topology creation.
com.minda.iconnect.topology.KryoScheme$1 should be instantiated within
the prepare method of 'kafkaspout at the earliest.
Appreciate any help to debug the issue and guide me down the right path. Thanks.
Answer 0 (score: 1)
Your ThreadLocal is not serializable. The best solution would be to make your serializer both serializable and threadsafe. If that is not possible, then I see two alternatives, since there is no prepare method here as you would get in a bolt.
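A minimal, Storm-free sketch of the "serializable and threadsafe" idea, using only the JDK: mark the ThreadLocal `transient` and re-create it lazily, so Java serialization (which Storm applies to the whole topology) never touches the ThreadLocal itself. `SafeScheme` and its `decode` method are hypothetical stand-ins for the real `KryoScheme`/Kryo types, which are not shown here:

```java
import java.io.*;
import java.nio.charset.StandardCharsets;

// Sketch: a Serializable "scheme" whose per-thread helper is transient and
// re-created lazily, so serializing the object never touches the ThreadLocal.
class SafeScheme implements Serializable {
    private static final long serialVersionUID = 1L;

    // transient: skipped during serialization; null again after deserialization
    private transient ThreadLocal<StringBuilder> perThread;

    private ThreadLocal<StringBuilder> perThread() {
        // Lazy re-init, so a deserialized copy rebuilds its helper on first use.
        // A benign race here at worst creates the ThreadLocal twice; a real
        // implementation could synchronize or use a static holder instead.
        if (perThread == null) {
            perThread = ThreadLocal.withInitial(StringBuilder::new);
        }
        return perThread;
    }

    public String decode(byte[] raw) {
        StringBuilder sb = perThread().get();
        sb.setLength(0);
        for (byte b : raw) sb.append((char) b);
        return sb.toString();
    }

    // Round-trip through Java serialization, mimicking what Storm does to a
    // topology before shipping it to workers.
    public static SafeScheme roundTrip(SafeScheme s) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(s);
        }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
            return (SafeScheme) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // The anonymous-inner-class ThreadLocal in the question's KryoScheme
        // would make writeObject throw; this version survives the round trip.
        SafeScheme copy = roundTrip(new SafeScheme());
        System.out.println(copy.decode("data".getBytes(StandardCharsets.US_ASCII)));
    }
}
```

The same pattern applied to the question's code would mean a `transient ThreadLocal<Kryo>` field accessed only through a lazy getter.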
Answer 1 (score: 0)
In the Storm lifecycle, the topology is instantiated and then serialized to byte format to be stored in ZooKeeper, prior to the topology being executed. Within this step, if a spout or bolt in the topology has an initialized unserializable property, serialization will fail.

If there is a need for a field that is unserializable, initialize it in the bolt or spout's prepare method, which is run after the topology is delivered to the worker.