I'm using the Debezium MongoDB connector to listen to a specific MongoDB collection so that each entry shows up as a message on a Kafka topic. This works fine with the following Kafka Connect configuration:
{
  "name": "mongo-source-connector",
  "config": {
    "connector.class": "io.debezium.connector.mongodb.MongoDbConnector",
    "mongodb.hosts": "192.168.0.151:27017",
    "mongodb.name": "mongo",
    "database.whitelist": "database",
    "tasks.max": 1,
    "max.batch.size": 2048,
    "poll.interval.ms": 5000,
    "collection.whitelist": "database.collection"
  }
}
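For reference, a configuration like this can be registered by POSTing it to the Kafka Connect REST API; the endpoint localhost:8083 and the file name mongo-source-connector.json below are assumptions that depend on how the Connect worker is deployed:

curl -X POST http://localhost:8083/connectors \
  -H "Content-Type: application/json" \
  -d @mongo-source-connector.json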
With this configuration, every Kafka message has the id of the original data record from MongoDB. Now I'm trying to implement a key transform that takes the value of a specific field in the JSON document as the Kafka message key, because the data should be partitioned by this field.
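To illustrate the goal (the field name is just the placeholder used throughout this question): for a source document like

  { "_id": ..., "specific-field-in-mongodb-source-record": "abc", ... }

the produced Kafka record should have the key "abc", so that Kafka's default partitioner (a murmur2 hash of the key bytes modulo the partition count) routes all documents with the same field value to the same partition.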
I have tried to create the key with the following configuration:
{
  "name": "mongo-source-connector",
  "config": {
    "connector.class": "io.debezium.connector.mongodb.MongoDbConnector",
    "mongodb.hosts": "192.168.0.151:27017",
    "mongodb.name": "mongo",
    "database.whitelist": "database",
    "tasks.max": 1,
    "max.batch.size": 2048,
    "poll.interval.ms": 5000,
    "collection.whitelist": "database.collection",
    "transforms": "createKey",
    "transforms.createKey.type": "org.apache.kafka.connect.transforms.ValueToKey",
    "transforms.createKey.fields": "specific-field-in-mongodb-source-record"
  }
}
With that, I only get this error in Kafka Connect:
[2019-10-10 11:35:44,049] INFO 2048 records sent for replica set 'dev-shard-01', last offset: {sec=1570707340, ord=1, initsync=true, h=-8774414475389548112} (io.debezium.connector.mongodb.MongoDbConnectorTask)
[2019-10-10 11:35:44,050] INFO WorkerSourceTask{id=mongo-source-connector-0} Committing offsets (org.apache.kafka.connect.runtime.WorkerSourceTask)
[2019-10-10 11:35:44,050] INFO WorkerSourceTask{id=mongo-source-connector-0} flushing 0 outstanding messages for offset commit (org.apache.kafka.connect.runtime.WorkerSourceTask)
[2019-10-10 11:35:44,050] ERROR WorkerSourceTask{id=mongo-source-connector-0} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask)
java.lang.NullPointerException
at org.apache.kafka.connect.transforms.ValueToKey.applyWithSchema(ValueToKey.java:85)
at org.apache.kafka.connect.transforms.ValueToKey.apply(ValueToKey.java:65)
at org.apache.kafka.connect.runtime.TransformationChain.apply(TransformationChain.java:38)
at org.apache.kafka.connect.runtime.WorkerSourceTask.sendRecords(WorkerSourceTask.java:218)
at org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:194)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:170)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:214)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[2019-10-10 11:35:44,050] ERROR WorkerSourceTask{id=mongo-source-connector-0} Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask)
Another configuration I have tried is the following:
{
  "name": "mongo-source-connector",
  "config": {
    "connector.class": "io.debezium.connector.mongodb.MongoDbConnector",
    "mongodb.hosts": "192.168.0.151:27017",
    "mongodb.name": "mongo",
    "database.whitelist": "database",
    "tasks.max": 1,
    "max.batch.size": 2048,
    "poll.interval.ms": 5000,
    "collection.whitelist": "database.collection",
    "transforms": "unwrap,insertKey,extractKey",
    "transforms.unwrap.type": "io.debezium.transforms.UnwrapFromEnvelope",
    "transforms.unwrap.drop.tombstones": "false",
    "transforms.insertKey.type": "org.apache.kafka.connect.transforms.ValueToKey",
    "transforms.insertKey.fields": "specific-field-in-mongodb-source-record",
    "transforms.extractKey.type": "org.apache.kafka.connect.transforms.ExtractField$Key",
    "transforms.extractKey.field": "specific-field-in-mongodb-source-record",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "key.converter.schemas.enable": "true",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": "false"
  }
}
This also leads to an error:
[2019-10-10 12:27:04,915] ERROR WorkerSourceTask{id=mongo-source-connector-0} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask)
org.apache.kafka.connect.errors.DataException: Only Struct objects supported for [copying fields from value to key], found: java.lang.String
at org.apache.kafka.connect.transforms.util.Requirements.requireStruct(Requirements.java:52)
at org.apache.kafka.connect.transforms.ValueToKey.applyWithSchema(ValueToKey.java:79)
at org.apache.kafka.connect.transforms.ValueToKey.apply(ValueToKey.java:65)
at org.apache.kafka.connect.runtime.TransformationChain.apply(TransformationChain.java:38)
at org.apache.kafka.connect.runtime.WorkerSourceTask.sendRecords(WorkerSourceTask.java:218)
at org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:194)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:170)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:214)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[2019-10-10 12:27:04,915] ERROR WorkerSourceTask{id=mongo-source-connector-0} Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask)
Does anyone know whether and how an element from the JSON document coming out of MongoDB can be turned into the Kafka message key?
Thanks!
Answer 0 (score: 1)
After some more testing I found a working solution. It turned out that I don't need the third transform (extractKey); unwrapping the envelope and then applying the ValueToKey transform is sufficient. The decisive change is using the MongoDB-specific io.debezium.connector.mongodb.transforms.UnwrapFromMongoDbEnvelope transform. In the Debezium MongoDB envelope the document travels as a JSON string inside the after/patch fields, which is why ValueToKey applied to the raw envelope throws a NullPointerException (the field does not exist in the envelope schema), and why the generic UnwrapFromEnvelope leaves behind a plain String, producing the "Only Struct objects supported" error. UnwrapFromMongoDbEnvelope parses that string into a structured record whose fields ValueToKey can access.
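Schematically (simplified; the envelope field names follow the Debezium MongoDB documentation, and the document content is just a placeholder), the record value before and after unwrapping looks roughly like this:

Before unwrapping (the document is a JSON string in "after"):

{
  "after": "{\"_id\": ..., \"specific-field-in-mongodb-source-record\": \"abc\", ...}",
  "patch": null,
  "source": { ... },
  "op": "c",
  "ts_ms": ...
}

After UnwrapFromMongoDbEnvelope (a structured record that ValueToKey can work with):

{
  "_id": ...,
  "specific-field-in-mongodb-source-record": "abc",
  ...
}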
For the sake of completeness, here is the working configuration:
{
  "name": "mongo-source-connector",
  "config": {
    "connector.class": "io.debezium.connector.mongodb.MongoDbConnector",
    "mongodb.hosts": "192.168.0.151:27017",
    "mongodb.name": "mongo",
    "database.whitelist": "database",
    "tasks.max": 1,
    "max.batch.size": 2048,
    "poll.interval.ms": 5000,
    "collection.whitelist": "database.collection",
    "transforms": "unwrap,insertKey",
    "transforms.unwrap.type": "io.debezium.connector.mongodb.transforms.UnwrapFromMongoDbEnvelope",
    "transforms.unwrap.drop.tombstones": "false",
    "transforms.unwrap.delete.handling.mode": "drop",
    "transforms.unwrap.operation.header": "true",
    "transforms.insertKey.type": "org.apache.kafka.connect.transforms.ValueToKey",
    "transforms.insertKey.fields": "specific-field-in-mongodb-source-record",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "key.converter.schemas.enable": "false",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": "false"
  }
}
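To check that the keys are set as expected, the console consumer shipped with Kafka can print keys next to values. The topic name follows Debezium's <mongodb.name>.<database>.<collection> pattern, so it is mongo.database.collection here; the broker address localhost:9092 is an assumption:

kafka-console-consumer.sh --bootstrap-server localhost:9092 \
  --topic mongo.database.collection \
  --from-beginning \
  --property print.key=true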