Kafka jdbc sink connect上的模式异常

时间:2018-01-30 06:22:51

标签: jdbc apache-kafka apache-kafka-connect confluent

我正在尝试使用Kafka jdbc sink connect将行插入到我的Oracle表中。我在我的Kafka主题(JSON)中有消息,如下所示;

[{"f1":"qws","f2":"zcz","f3":"SDFF","f4":"f33bfed577bcd7c4625479bd3cd13323--1132061303","f5":null,"f6":null,"f7":"ghSDAgh/akdjytfd/jhsgd","f8":"hsfgd/sdfjghsfjd/jsg","f9":null,"f10":"ASD","f11":"sdfg/vbnm","f12":"S","startTime":"2018-01-30T05:24:41.162","_startTime":"DATE","f13":219,"f14":"http://192.168.0.1:1234/asd/fgh/jkl/zxc/vbn/qwe/rty","f15":"fe80:0:0:0:7501:14d9:b44b:2a95%eth5","f16":1234,"f17":"ABCD-1234","f18":"192.168.0.1","f19":"sdfgd","dfgVO":{"fa1":null,"fa2":"formats","fa3":""qwe.rty.uiop.asd.fgh.jkl.zxc.vbn.asdf@61e97f29"","fa4":7,"fa5":79,"fa6":null,"fa7":"{}","fa8":1517289881381},"f20":null,"f21":"http-drte-1234-uik-7","f22":false,"f23":false,"f24":false}]

我有如下连接器配置;

name=jdbc-sink-2
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
tasks.max=1
topics=my_topic_1
connection.url=jdbc:oracle:thin:@192.168.0.1:1521:user01
connection.user=USER1
connection.password=PASSWD1
auto.create=true
table.name.format=MY_TABLE_2
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
producer.retries=1

当我启动连接器时,我收到以下错误;

[2018-01-30 11:16:55,417] ERROR Task jdbc-sink-2 threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask:148)
org.apache.kafka.connect.errors.DataException: JsonConverter with schemas.enable requires "schema" and "payload" fields and may not contain additional fields. If you are trying to deserialize plain JSON data, set schemas.enable=false in your converter configuration.
at org.apache.kafka.connect.json.JsonConverter.toConnectData(JsonConverter.java:308)
at org.apache.kafka.connect.runtime.WorkerSinkTask.convertMessages(WorkerSinkTask.java:406)
at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:250)
at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:180)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:148)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:146)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:190)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
[2018-01-30 11:16:55,422] ERROR Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask:149)

然后我将以下配置添加到现有的连接器配置中;

key.converter.schemas.enable=false
value.converter.schemas.enable=false

现在,我收到的另一个错误如下;

[2018-01-30 11:36:58,118] ERROR Task jdbc-sink-2 threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerSinkTask:455)
org.apache.kafka.connect.errors.ConnectException: No fields found using key and value schemas for table: MY_TABLE_2
at io.confluent.connect.jdbc.sink.metadata.FieldsMetadata.extract(FieldsMetadata.java:190)
at io.confluent.connect.jdbc.sink.metadata.FieldsMetadata.extract(FieldsMetadata.java:58)
at io.confluent.connect.jdbc.sink.BufferedRecords.add(BufferedRecords.java:65)
at io.confluent.connect.jdbc.sink.JdbcDbWriter.write(JdbcDbWriter.java:62)
at io.confluent.connect.jdbc.sink.JdbcSinkTask.put(JdbcSinkTask.java:66)
at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:435)
at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:251)
at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:180)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:148)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:146)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:190)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
[2018-01-30 11:36:58,123] ERROR Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerSinkTask:456)
[2018-01-30 11:36:58,124] ERROR Task jdbc-sink-2 threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask:148)
org.apache.kafka.connect.errors.ConnectException: Exiting WorkerSinkTask due to unrecoverable exception.
at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:457)
at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:251)
at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:180)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:148)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:146)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:190)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
[2018-01-30 11:36:58,125] ERROR Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask:149)

这表示我需要像键值模式格式一样修改我的Kafka消息。我无法修改我的Kafka消息格式,因为它是由其他人发布的。我该如何解决这个错误?

谢谢。

1 个答案:

答案 0 :(得分:0)

Per doc,如果要使用JDBC Sink,则需要提供架构。您可以使用Avro + Schema Registry或使用带嵌入式架构的JSON 来执行此操作。您可以看到预期的JSON结构here的示例。

您的数据来自哪里?如果是Kafka Connect源,您可以在启用了模式的情况下使用Avro或JSON。如果它在其他地方,您需要修改它以提供包含模式的数据 - 随模式注册表提供的Avro序列化程序可以为您完成此操作。