Kafka sink connector --> postgres, fails with avro JSON data

Time: 2021-02-14 17:12:56

Tags: apache-kafka avro apache-kafka-connect confluent-platform confluent-schema-registry

I set up a Kafka JDBC sink to send events to PostgreSQL. I wrote this simple producer, which sends JSON-with-schema (avro) data to a topic, like this:

producer.py (kafka-python)

biometrics = {
    "heartbeat": self.pulse,        # integer
    "oxygen": self.oxygen,          # integer
    "temprature": self.temprature,  # float
    "time": time,                   # string
}

# "schema" holds the raw Avro schema text; "payload" holds the record.
# Assumes the producer was created with a JSON value_serializer,
# e.g. value_serializer=lambda v: json.dumps(v).encode("utf-8").
avro_value = {
    "schema": open(BASE_DIR + "/biometrics.avsc").read(),
    "payload": biometrics,
}

producer.send(
    "biometrics",
    key="some_string",
    value=avro_value,
)

Value schema:

{
    "type": "record",
    "name": "biometrics",
    "namespace": "athlete",
    "doc": "athletes biometrics"
    "fields": [
        {
            "name": "heartbeat",
            "type": "int",
            "default": 0
        },
        {
            "name": "oxygen",
            "type": "int",
            "default": 0
        },
        {
            "name": "temprature",
            "type": "float",
            "default": 0.0
        },
        {
            "name": "time",
            "type": "string"
            "default": ""
        }
    ]
}

Connector config (without host, password, etc.):

{
    "name": "jdbc_sink",
    "connector.class": "io.aiven.connect.jdbc.JdbcSinkConnector",
    "tasks.max": "1",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter ",
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "topics": "biometrics",
    "insert.mode": "insert",
    "auto.create": "true"
}

But my connector is failing hard, with three errors, and I can't figure out the cause of any of them:

TL;DR version of the logs

(Error 1) Caused by: org.apache.kafka.connect.errors.DataException: biometrics
(Error 2) Caused by: org.apache.kafka.common.errors.SerializationException: Error deserializing Avro message for id -1
(Error 3) Caused by: org.apache.kafka.common.errors.SerializationException: Unknown magic byte!

Full log

org.apache.kafka.connect.errors.ConnectException: Tolerance exceeded in error handler
    at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:206)
    at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execute(RetryWithToleranceOperator.java:132)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.convertAndTransformRecord(WorkerSinkTask.java:498)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.convertMessages(WorkerSinkTask.java:475)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:325)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:229)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:201)
    at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:185)
    at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:235)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: org.apache.kafka.connect.errors.DataException: biometrics
    at io.confluent.connect.avro.AvroConverter.toConnectData(AvroConverter.java:98)
    at org.apache.kafka.connect.storage.Converter.toConnectData(Converter.java:87)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.lambda$convertAndTransformRecord$1(WorkerSinkTask.java:498)
    at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndRetry(RetryWithToleranceOperator.java:156)
    at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:190)
    ... 13 more
Caused by: org.apache.kafka.common.errors.SerializationException: Error deserializing Avro message for id -1
Caused by: org.apache.kafka.common.errors.SerializationException: Unknown magic byte!

Can anyone help me understand these errors and their root cause?

1 Answer:

Answer 0: (score: 1)

The error is because you need to use the JsonConverter class with value.converter.schemas.enable=true in your connector, since that is what is being produced; however, the schema payload is an Avro schema rather than the Connect schema representation that JsonConverter expects, so it may still fail with those changes alone...
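
As a minimal sketch (field names taken from the question; the example payload values are made up), the converter change and the Connect-style "schema"/"payload" envelope that org.apache.kafka.connect.json.JsonConverter expects would look roughly like this:

    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": "true"

{
    "schema": {
        "type": "struct",
        "name": "biometrics",
        "fields": [
            {"field": "heartbeat",  "type": "int32",  "optional": false},
            {"field": "oxygen",     "type": "int32",  "optional": false},
            {"field": "temprature", "type": "float",  "optional": false},
            {"field": "time",       "type": "string", "optional": false}
        ]
    },
    "payload": {
        "heartbeat": 72,
        "oxygen": 98,
        "temprature": 36.6,
        "time": "2021-02-14 17:12:56"
    }
}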

If you want to actually send Avro, use the AvroProducer from the confluent-kafka library, which requires a running Schema Registry.
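
A sketch of that approach with confluent-kafka's AvroProducer; the broker and Schema Registry addresses are placeholders, and BASE_DIR and biometrics are the same names as in the question:

from confluent_kafka import avro
from confluent_kafka.avro import AvroProducer

# Load the same .avsc file; here it is parsed as a real Avro schema
# instead of being embedded as a string inside the message.
value_schema = avro.load(BASE_DIR + "/biometrics.avsc")

producer = AvroProducer(
    {
        "bootstrap.servers": "localhost:9092",           # placeholder
        "schema.registry.url": "http://localhost:8081",  # placeholder
    },
    default_value_schema=value_schema,
)

# The producer Avro-serializes the record and registers the schema
# with the Schema Registry; no "schema"/"payload" wrapper is needed.
producer.produce(topic="biometrics", value=biometrics)
producer.flush()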