Flume and Cassandra: getting DataException: Failed to deserialize data to Avro, unknown magic byte

Date: 2016-09-28 23:10:09

Tags: cassandra apache-kafka avro flume confluent

I am following the tutorial at http://www.confluent.io/blog/kafka-connect-cassandra-sink-the-perfect-match/ and I can insert data from the avro console producer into Cassandra. Now I am trying to extend this to use Flume: I have set up Flume on my machine to pick up a log file and push it to Kafka, and then insert the data into the Cassandra database. In the text file I put this data:

{"id":1,"created":"2016-05-06 13:53:00","product":"OP-DAX-P-20150201-95.7","price":94.2}

{"id":2,"created":"2016-05-06 13:54:00","product":"OP-DAX-C-20150201-100","price":99.5}

{"id":3,"created":"2016-05-06 13:55:00","product":"FU-DATAMOUNTAINEER-20150201-100","price":10000}

{"id":4,"created":"2016-05-06 13:56:00","product":"FU-KOSPI-C-20150201-100","price":150}

Flume is picking up this data and pushing it to Kafka.

In the Cassandra sink I am getting this error:

ERROR Task cassandra-sink-orders-0 threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask:142)
org.apache.kafka.connect.errors.DataException: Failed to deserialize data to Avro:
    at io.confluent.connect.avro.AvroConverter.toConnectData(AvroConverter.java:109)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.convertMessages(WorkerSinkTask.java:346)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:226)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:170)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:142)
    at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:140)
    at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:175)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.kafka.common.errors.SerializationException: Error deserializing Avro message for id -1
Caused by: org.apache.kafka.common.errors.SerializationException: Unknown magic byte!
[2016-09-28 15:47:00,951] ERROR Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask:143)
[2016-09-28 15:47:00,951] INFO Stopping Cassandra sink. (com.datamountaineer.streamreactor.connect.cassandra.sink.CassandraSinkTask:79)
[2016-09-28 15:47:00,952] INFO Shutting down Cassandra driver session and cluster. (com.datamountaineer.streamreactor.connect.cassandra.sink.CassandraJsonWriter:165)
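For context, the "Unknown magic byte!" message comes from the Confluent Avro deserializer that io.confluent.connect.avro.AvroConverter uses: it expects every message value to start with a one-byte magic marker (0) followed by a 4-byte Schema Registry id, with the Avro-encoded payload after that. Below is a minimal Java sketch of that framing check; the class and method names are made up for illustration and are not part of any library.

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class WireFormatCheck {
    private static final byte MAGIC_BYTE = 0x0;

    // Returns the Schema Registry id if the value follows the Confluent wire format,
    // otherwise fails the same way the converter does.
    static int schemaIdOf(byte[] value) {
        ByteBuffer buffer = ByteBuffer.wrap(value);
        if (buffer.get() != MAGIC_BYTE) {
            throw new IllegalArgumentException("Unknown magic byte!");
        }
        return buffer.getInt(); // the 4-byte schema id that follows the magic byte
    }

    public static void main(String[] args) {
        // A plain UTF-8 JSON line, i.e. what a text file pushed through a spooldir source looks like on the topic
        byte[] plainJson = "{\"id\":1,\"price\":94.2}".getBytes(StandardCharsets.UTF_8);
        System.out.println(schemaIdOf(plainJson)); // throws: the first byte is '{', not 0
    }
}

Messages written by kafka-avro-console-producer carry this header; a raw text line starts with '{' instead, which is the case the main method above simulates.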

The schema I am using:

./confluent/bin/kafka-avro-console-producer \
  --broker-list localhost:9092 \
  --topic orders-topic \
  --property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"id","type":"int"}, {"name":"created", "type": "string"}, {"name":"product", "type": "string"}, {"name":"price", "type": "double"}]}'
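For reference, this console producer reads one JSON record per line from stdin, registers the schema given in value.schema with the Schema Registry, and writes each record to the topic Avro-encoded. A record from the data above would be entered at the prompt as a single line, for example:

{"id": 1, "created": "2016-05-06 13:53:00", "product": "OP-DAX-P-20150201-95.7", "price": 94.2}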

Flume configuration: Flume-kafka.conf.properties

agent.sources = spoolDirSrc
agent.channels = memoryChannel
agent.sinks = kafkaSink


agent.sources.spoolDirSrc.type = spooldir
agent.sources.spoolDirSrc.spoolDir = eventlogs
agent.sources.spoolDirSrc.inputCharset = UTF-8
agent.sources.spoolDirSrc.deserializer.maxLineLength = 1048576

agent.sources.spoolDirSrc.channels = memoryChannel
agent.sinks.kafkaSink.channel = memoryChannel
agent.channels.memoryChannel.type = memory

agent.channels.memoryChannel.capacity = 1000

 agent.sinks.kafkaSink.type = org.apache.flume.sink.kafka.KafkaSink
 agent.sinks.kafkaSink.topic = orders-topic
 agent.sinks.kafkaSink.brokerList = localhost:9092
 agent.sinks.kafkaSink.channel = memoryChannel
 agent.sinks.kafkaSink.batchSize = 20
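As a quick sanity check (not part of the original post), you can look at what the Flume KafkaSink actually puts on the topic: it sends the raw event body bytes, so a console consumer should show the plain JSON lines from the spool directory. The path below mirrors the Confluent layout used above and is an assumption about your installation; on newer Kafka versions use --bootstrap-server localhost:9092 instead of --zookeeper.

./confluent/bin/kafka-console-consumer --zookeeper localhost:2181 --topic orders-topic --from-beginning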

Can anyone help me with how to fix this error?

1 Answer:

Answer 0 (score: 0)

Typically, if you get an unknown magic byte it means the Kafka client and server versions are incompatible. Check to make sure that your Cassandra sink version was built against a Kafka client library whose version is less than or equal to your broker's.
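A minimal way to check the broker side, assuming the Confluent distribution layout used in the question (the exact path is an assumption): the kafka-clients jar name carries the client-library version shipped with the broker, which you can compare against the Kafka version listed for your Cassandra sink release.

ls ./confluent/share/java/kafka/ | grep kafka-clients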