How do I produce Kafka Avro records identical to those produced by the Avro console producer?

Time: 2018-01-10 20:50:40

Tags: java apache-kafka avro kafka-producer-api confluent-schema-registry

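I am using Confluent 3.3.0. My goal is to use kafka-connect to insert values from a Kafka topic into an Oracle table. My connector works fine with Avro records produced by the Avro console producer, started like this: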

./kafka-avro-console-producer --broker-list 192.168.0.1:9092 --topic topic6 --property value.schema='{"type":"record","name":"flights3","fields":[{"name":"flight_id","type":"string"},{"name":"flight_to", "type": "string"}, {"name":"flight_from", "type": "string"}]}'

The value I insert looks like this:

{"flight_id":"1","flight_to":"QWE","flight_from":"RTY"}

What I want to achieve is to insert the same data from a Java application, using objects. Here is my producer code:

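import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import vo.Flight;

public class Sender {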
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "192.168.0.1:9092");
        props.put("acks", "all");
        props.put("retries", 0);
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "serializers.custom.FlightSerializer");
        props.put("schema.registry.url", "http://192.168.0.1:8081");
        Producer<String, Flight> producer = new KafkaProducer<String, Flight>(props);
        Flight myflight = new Flight("testflight1","QWE","RTY");
        ProducerRecord<String, Flight> record = new ProducerRecord<String, Flight>("topic5","key",myflight);

        try {
            producer.send(record).get();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Here is the Flight VO:

package vo;

public class Flight {
    String flight_id,flight_to,flight_from;

    public Flight(String flight_id, String flight_to, String flight_from) {
        this.flight_id = flight_id;
        this.flight_to = flight_to;
        this.flight_from = flight_from;
    }

    public Flight(){
    }

    public String getFlight_id() {
        return flight_id;
    }

    public void setFlight_id(String flight_id) {
        this.flight_id = flight_id;
    }

    public String getFlight_to() {
        return flight_to;
    }

    public void setFlight_to(String flight_to) {
        this.flight_to = flight_to;
    }

    public String getFlight_from() {
        return flight_from;
    }

    public void setFlight_from(String flight_from) {
        this.flight_from = flight_from;
    }
}

And finally, the serializer:

package serializers.custom;

import java.util.Map;
import org.apache.kafka.common.serialization.Serializer;
import vo.Flight;
import com.fasterxml.jackson.databind.ObjectMapper;

public class FlightSerializer implements Serializer<Flight> {
    @Override
    public void close() {
    }

    @Override
    public void configure(Map<String, ?> arg0, boolean arg1) {
    }

    @Override
    public byte[] serialize(String arg0, Flight arg1) {
        byte[] retVal = null;
        ObjectMapper objectMapper = new ObjectMapper();

        try {
            retVal = objectMapper.writeValueAsString(arg1).getBytes();
        } catch (Exception e) {
            e.printStackTrace();
        }

        return retVal;
    }
}

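My understanding, though, is that I need to define something like a schema and use an Avro serializer to get exactly the same data as when I use the Avro console consumer. I have read some sample code, but none of it worked for me.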

Edit

I tried the following code, but the Avro console consumer shows nothing.

package producer.serialized.avro;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import vo.Flight;
import java.util.Properties;

public class Sender {
    public static void main(String[] args) {
        String flightSchema = "{\"type\":\"record\"," + "\"name\":\"flights\","
                + "\"fields\":[{\"name\":\"flight_id\",\"type\":\"string\"},{\"name\":\"flight_to\",\"type\":\"string\"},{\"name\":\"flight_from\",\"type\":\"string\"}]}";
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "192.168.0.1:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,io.confluent.kafka.serializers.KafkaAvroSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,io.confluent.kafka.serializers.KafkaAvroSerializer.class);
        props.put("schema.registry.url", "http://192.168.0.1:8081");
        KafkaProducer producer = new KafkaProducer(props);
        Schema.Parser parser = new Schema.Parser();
        Schema schema = parser.parse(flightSchema);
        GenericRecord avroRecord = new GenericData.Record(schema);
        avroRecord.put("flight_id", "1");
        avroRecord.put("flight_to", "QWE");
        avroRecord.put("flight_from", "RTY");
        ProducerRecord<String, GenericRecord> record = new ProducerRecord<>("topic6", avroRecord);

        try {
            producer.send(record);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

2 answers:

Answer 0 (score: 1)

No schema is defined, so when KafkaAvroSerializer has to contact the Schema Registry to submit the schema, it will not have one.
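To see why the JSON bytes from a custom Jackson serializer cannot stand in for the console producer's output, it may help to look at what the Confluent serializer writes: a magic byte 0, a 4-byte big-endian schema ID, and then the Avro binary encoding of the record. The sketch below hand-encodes the example flight with plain JDK classes, purely for illustration. The class name WireFormatSketch and the schema ID 1 are assumptions (the real ID is assigned by the Schema Registry), and in practice KafkaAvroSerializer does this for you.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustration only: hand-rolled framing, not the Confluent implementation.
final class WireFormatSketch {

    // Avro writes int/long values as zig-zag encoded variable-length integers.
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    // An Avro string is its UTF-8 byte length (as a long) followed by the bytes.
    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    // schemaId is hypothetical here; the Schema Registry assigns the real one.
    static byte[] encodeFlight(int schemaId, String flightId, String flightTo, String flightFrom) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(0);                                                   // magic byte
        out.write(ByteBuffer.allocate(4).putInt(schemaId).array(), 0, 4); // schema ID, big-endian
        // Record fields are written back to back, in schema order.
        writeString(out, flightId);
        writeString(out, flightTo);
        writeString(out, flightFrom);
        return out.toByteArray();
    }

    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        byte[] bytes = encodeFlight(1, "1", "QWE", "RTY");
        // prints: 00 00 00 00 01 02 31 06 51 57 45 06 52 54 59
        System.out.println(toHex(bytes));
    }
}
```

A consumer using the Avro deserializer expects this framing, so a value that begins with the `{` of a JSON document will not deserialize as an Avro record.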

You have to create a schema for your Flight object.

The example below, a file.avdl (one of the Avro file extensions), would do fine:

@namespace("vo")
protocol FlightSender {

    record Flight {
       union{null, string} flight_id = null;
       union{null, string} flight_to = null;
       union{null, string} flight_from = null;
    }
}

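See the Avro IDL docs.

At compile time, when you use the avro-maven-plugin, the Avro schema above will generate your Java Flight class, so you have to delete the one you created earlier.

In your main class, you have to set the following two properties: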

props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,io.confluent.kafka.serializers.KafkaAvroSerializer.class);
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,io.confluent.kafka.serializers.KafkaAvroSerializer.class); 

In your producer, you can then use the generated specific Avro class:

Producer<String, Flight> producer = new KafkaProducer<String, Flight>(props);

Hope that helps :-)

Answer 1 (score: 0)

  

"the exact data, just as when I use the Avro console consumer"

You can take a peek at the source code for that.

Assuming you want to use a generic record, this is all correct:

Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "192.168.0.1:9092");
...
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,io.confluent.kafka.serializers.KafkaAvroSerializer.class);
props.put("schema.registry.url", "http://192.168.0.1:8081");

Producer<String, GenericRecord> producer = new KafkaProducer<>(props);

...

GenericRecord avroRecord = new GenericData.Record(schema);
avroRecord.put("flight_id", "1");
avroRecord.put("flight_to", "QWE");
avroRecord.put("flight_from", "RTY");
ProducerRecord<String, GenericRecord> record = new ProducerRecord<>("topic6", avroRecord);

try {
    producer.send(record);
} catch (Exception e) {
    e.printStackTrace();
}

But you are missing the calls to producer.flush() and producer.close() at the end, so the batch of records is never actually sent.
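The effect can be pictured with a toy model: send() only places the record in a buffer, and nothing leaves the buffer until flush() or close() runs. The class below, ToyBufferingProducer, is a hypothetical stand-in written for this illustration and not Kafka's implementation; the real producer hands batches to a background I/O thread, but a main method that returns right after send() can still exit before that thread has sent anything.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: mimics "send() buffers, flush()/close() delivers".
final class ToyBufferingProducer implements AutoCloseable {
    private final List<String> buffer = new ArrayList<>();
    private final List<String> delivered = new ArrayList<>();

    // Queues the record; nothing is delivered yet.
    void send(String record) {
        buffer.add(record);
    }

    // Delivers everything that is currently buffered.
    void flush() {
        delivered.addAll(buffer);
        buffer.clear();
    }

    // Closing flushes first, so no buffered record is lost.
    @Override
    public void close() {
        flush();
    }

    List<String> delivered() {
        return delivered;
    }

    public static void main(String[] args) {
        ToyBufferingProducer producer = new ToyBufferingProducer();
        producer.send("flight-1");
        System.out.println("before close: " + producer.delivered().size()); // prints "before close: 0"
        producer.close();
        System.out.println("after close: " + producer.delivered().size());  // prints "after close: 1"
    }
}
```

With the real client, the same guarantee comes from calling producer.flush() and producer.close() after send(), or from creating the KafkaProducer in a try-with-resources block so that close() always runs.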