MongoDB Kafka Connector cannot generate the message key from the Mongo document ID

Date: 2019-07-26 11:31:33

Tags: mongodb apache-kafka apache-kafka-connect mongodb-kafka-connector

I am using the beta version of the MongoDB Kafka Connector to publish from MongoDB to a Kafka topic.

Messages are produced to Kafka, but their keys are null when they should be the document ID.

Here is my Connect standalone configuration:

bootstrap.servers=xxx:9092

# The converters specify the format of data in Kafka and how to translate it into Connect data. Every Connect user will
# need to configure these based on the format they want their data in when loaded from or stored into Kafka
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
# Converter-specific settings can be passed in by prefixing the Converter's setting with the converter you want to apply
# it to
key.converter.schemas.enable=false
value.converter.schemas.enable=false

# The internal converter used for offsets and config data is configurable and must be specified, but most users will
# always want to use the built-in default. Offset and config data is never visible outside of Kafka Connect in this format.
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false

And the MongoDB source properties:

name=mongo-source
connector.class=com.mongodb.kafka.connect.MongoSourceConnector
tasks.max=1

# Connection and source configuration
connection.uri=mongodb+srv://xxx
database=mydb
collection=mycollection

topic.prefix=someprefix
poll.max.batch.size=1000
poll.await.time.ms=5000

# Change stream options
pipeline=[]
batch.size=0
change.stream.full.document=updateLookup
collation=

Here is an example of a message's string value:

"{\"_id\": {\"_data\": \"xxx\"}, \"operationType\": \"replace\", \"clusterTime\": {\"$timestamp\": {\"t\": 1564140389, \"i\": 1}}, \"fullDocument\": {\"_id\": \"5\", \"name\": \"Some Client\", \"clientId\": \"someclient\", \"clientSecret\": \"1234\", \"whiteListedIps\": [], \"enabled\": true, \"_class\": \"myproject.Client\"}, \"ns\": {\"db\": \"mydb\", \"coll\": \"mycollection\"}, \"documentKey\": {\"_id\": \"5\"}}"

I tried using a transform to extract it from the value (specifically from the documentKey field):

transforms=InsertKey
transforms.InsertKey.type=org.apache.kafka.connect.transforms.ValueToKey
transforms.InsertKey.fields=documentKey

But this throws an exception:

Caused by: org.apache.kafka.connect.errors.DataException: Only Struct objects supported for [copying fields from value to key], found: java.lang.String
    at org.apache.kafka.connect.transforms.util.Requirements.requireStruct(Requirements.java:52)
    at org.apache.kafka.connect.transforms.ValueToKey.applyWithSchema(ValueToKey.java:79)
    at org.apache.kafka.connect.transforms.ValueToKey.apply(ValueToKey.java:65)

Any ideas on how to generate a key containing the document ID?

2 Answers:

Answer 0: (score: 0)

According to the exception thrown:

Caused by: org.apache.kafka.connect.errors.DataException: Only Struct objects supported for [copying fields from value to key], found: java.lang.String
    at org.apache.kafka.connect.transforms.util.Requirements.requireStruct(Requirements.java:52)
    at org.apache.kafka.connect.transforms.ValueToKey.applyWithSchema(ValueToKey.java:79)
    at org.apache.kafka.connect.transforms.ValueToKey.apply(ValueToKey.java:65)

Unfortunately, the MongoDB connector you are using does not create a proper schema.

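The connector creates each record with a String schema for both the key and the value. See the following line: How record is created by connector. ValueToKey needs a Struct value to copy fields from, so it rejects the String value, and that is why you cannot apply the transformation to it.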
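Until the connector emits structured records, one possible workaround is to derive the key downstream of Kafka Connect. The sketch below is only an illustration, not part of the connector: it assumes the value arrives exactly as in the question (a JSON document that the connector serialized to a string and that JsonConverter then wrote as a JSON string literal), and `extract_document_key` is a hypothetical helper name. The sample value is abbreviated from the one in the question.

```python
import json

# Abbreviated copy of the sample value from the question. The outer quotes and
# the escaped inner quotes show that the payload is JSON wrapped in a JSON string.
raw_value = r'''"{\"_id\": {\"_data\": \"xxx\"}, \"operationType\": \"replace\", \"fullDocument\": {\"_id\": \"5\", \"name\": \"Some Client\"}, \"ns\": {\"db\": \"mydb\", \"coll\": \"mycollection\"}, \"documentKey\": {\"_id\": \"5\"}}"'''


def extract_document_key(raw: str):
    """Return documentKey._id from a double-encoded change stream event.

    Hypothetical helper: assumes the layout shown in the question.
    """
    inner = json.loads(raw)  # first pass: JSON string literal -> Python str
    if not isinstance(inner, str):
        raise ValueError("expected the value to be a JSON-encoded string")
    event = json.loads(inner)  # second pass: str -> dict (the change event)
    return event["documentKey"]["_id"]


if __name__ == "__main__":
    print(extract_document_key(raw_value))
```

A consumer or a small re-keying job could call a helper like this on each message value and republish the record with the result as its key. If `_id` is an ObjectId rather than a plain string, the returned value would presumably be a nested object instead of a string, so the caller would need to handle that case.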

Answer 1: (score: 0)

This should be supported in version 1.3.0: https://jira.mongodb.org/browse/KAFKA-40