Unable to run the JDBC sink to move data from Kafka to MS SQL Server

Time: 2018-03-02 10:11:00

Tags: apache-kafka confluent-kafka

I have produced some JSON-formatted data to a Kafka topic, and I am trying to run the JDBC sink connector to send this data to Microsoft SQL Server.

Here is my sink.properties file:

name=sink-msql
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
tasks.max=1
topics=mytopicname
connection.url=jdbc:sqlserver://127.0.0.1:1111;DatabaseName=Test;user=username;password=pass
auto.create=true

Here is my worker configuration file:

bootstrap.servers=localhost:9092

key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=false
value.converter.schemas.enable=false

internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false

offset.storage.file.filename=/tmp/connect.offsets
rest.port=9998

plugin.path=/usr/local/share/java,/usr/local/share/kafka/plugins,/opt/connectors,
plugin.path=/usr/share/java

When I try to run the sink with

./bin/connect-standalone etc/schema-registry/connect-avro-standalone.properties etc/kafka-connect-jdbc/sink.properties

I get the following error:

org.apache.kafka.connect.errors.ConnectException: No fields found using key and value schemas for table: FCUBS1203.STTM_CUSTOMER
        at io.confluent.connect.jdbc.sink.metadata.FieldsMetadata.extract(FieldsMetadata.java:127)
        at io.confluent.connect.jdbc.sink.metadata.FieldsMetadata.extract(FieldsMetadata.java:64)
        at io.confluent.connect.jdbc.sink.BufferedRecords.add(BufferedRecords.java:71)
        at io.confluent.connect.jdbc.sink.JdbcDbWriter.write(JdbcDbWriter.java:66)
        at io.confluent.connect.jdbc.sink.JdbcSinkTask.put(JdbcSinkTask.java:69)
        at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:495)
        at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:288)
        at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:198)
        at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:166)
        at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:170)
        at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:214)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

Based on this (possible) solution, I also tried modifying my worker configuration file to set

key.converter.schemas.enable=true
value.converter.schemas.enable=true

but now a different error is reported:

ERROR WorkerSinkTask{id=sink-oracle-msql-0} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask:172)
org.apache.kafka.connect.errors.DataException: JsonConverter with schemas.enable requires "schema" and "payload" fields and may not contain additional fields. If you are trying to deserialize plain JSON data, set schemas.enable=false in your converter configuration.
        at org.apache.kafka.connect.json.JsonConverter.toConnectData(JsonConverter.java:308)

        at org.apache.kafka.connect.runtime.WorkerSinkTask.convertMessages(WorkerSinkTask.java:453)
        at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:287)
        at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:198)
        at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:166)
        at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:170)
        at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:214)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

Basically, the first error suggests setting key.converter.schemas.enable and value.converter.schemas.enable to true, while the second error suggests setting them to false.

Any suggestions would be much appreciated.

Edit

Here is one JSON entry (note that this JSON is generated by third-party replication software).

{  
   "magic":"atMSG",
   "type":"DT",
   "headers":null,
   "messageSchemaId":null,
   "messageSchema":null,
   "message":{  
      "data":{  
         "CUSTOMER_NO":"123456789",
         "CUSTOMER_NAME":"Giorgos"
      },
      "beforeData":null,
      "headers":{  
         "operation":"REFRESH",
         "changeSequence":"",
         "timestamp":"",
         "streamPosition":"",
         "transactionId":""
      }
   }
}

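I think the error is reported because of the more complex JSON model generated by the replication software. So how can I convert this JSON into a suitable format so that the Kafka sink can parse it?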

1 answer:

Answer 0 (score: 0)

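To be able to put data into an RDBMS, you need a schema. C.f.:

"The sink requires knowledge of schemas, so you should use a suitable converter, e.g. the Avro converter that comes with Schema Registry, or the JSON converter with schemas enabled."

Your data does not have a schema declared in it.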
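As a quick diagnostic, the condition named in the second error message (a JSON object containing "schema" and "payload" fields and nothing else) can be checked outside of Kafka Connect. The following is only a minimal sketch: the function name has_connect_envelope is hypothetical, and the sample strings are abbreviated stand-ins for the record shown in the question.

```python
import json


def has_connect_envelope(raw: str) -> bool:
    """Return True if raw is a JSON object whose only keys are 'schema' and 'payload'.

    This mirrors the wording of the DataException in the question; it is not
    the converter's own code.
    """
    doc = json.loads(raw)
    return isinstance(doc, dict) and set(doc.keys()) == {"schema", "payload"}


# Abbreviated version of the third-party record from the question.
plain = json.dumps({
    "magic": "atMSG",
    "type": "DT",
    "message": {"data": {"CUSTOMER_NO": "123456789"}},
})

# The same data wrapped in a schema/payload envelope.
enveloped = json.dumps({
    "schema": {
        "type": "struct",
        "optional": False,
        "fields": [{"field": "CUSTOMER_NO", "type": "string", "optional": False}],
    },
    "payload": {"CUSTOMER_NO": "123456789"},
})

print(has_connect_envelope(plain))      # False
print(has_connect_envelope(enveloped))  # True
```

A record that fails this check is rejected when schemas.enable=true, and with schemas.enable=false it reaches the sink without any schema, which is consistent with the two errors in the question.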

To declare the schema of your messages, you must either (a) use Avro with Schema Registry, or (b) use JSON with an embedded schema in this Kafka Connect format.
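For option (b), one possible approach is to reshape each record into the schema/payload envelope before it reaches the topic the sink reads from. The sketch below is illustrative only and rests on several assumptions: the function name to_connect_envelope and the schema name "customer" are hypothetical, every column is treated as an optional string (as in the sample record), and the row is taken from message.data. How the reshaped records are written back to Kafka is not shown.

```python
import json


def to_connect_envelope(record: dict, name: str = "customer") -> dict:
    """Wrap record['message']['data'] in a schema/payload envelope.

    Assumption: every column is declared as an optional string, because the
    sample record in the question only contains string values.
    """
    data = record["message"]["data"]
    fields = [{"field": key, "type": "string", "optional": True} for key in data]
    payload = {key: None if value is None else str(value) for key, value in data.items()}
    return {
        "schema": {"type": "struct", "name": name, "optional": False, "fields": fields},
        "payload": payload,
    }


# Trimmed copy of the record from the question.
source = {
    "magic": "atMSG",
    "type": "DT",
    "message": {
        "data": {"CUSTOMER_NO": "123456789", "CUSTOMER_NAME": "Giorgos"},
        "beforeData": None,
        "headers": {"operation": "REFRESH"},
    },
}

envelope = to_connect_envelope(source)
print(json.dumps(envelope, indent=2))
```

The resulting object has only the two top-level fields "schema" and "payload", which is the shape the JsonConverter error message asks for when schemas.enable=true. Real data would likely need proper types (numbers, timestamps) rather than strings throughout.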

If your data source (a third-party CDC tool, by the sound of it) cannot meet this requirement, then you cannot land the data in a target RDBMS.

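You can prove that all of this works in principle by using the JDBC Connector at the other end of the pipeline, since the JDBC Connector can write both Avro and JSON with an embedded schema.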