How to keep date field types when streaming to Kafka

Time: 2019-07-02 13:46:11

Tags: mongodb apache-kafka apache-kafka-connect debezium

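Using debezium-mongodb-connector I managed to push my collections to Kafka. The only problem I am facing is that the date fields in one of my collections have the format 2019-05-14T23:25:34.703+00:00, but they are not pushed to the topic in that format. Instead I get something like 1560708085175.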

This is my Debezium connector command:

connect-standalone /etc/kafka/connect-standalone.properties /etc/kafka/connect-mongodb-source.properties

This is a sample document from my MongoDB collection:

{"_id":"5cdb4e6ed767ba70593e2aa8","sender":"5cdb43db4505956efc70ba03","receiver":"5cdb43db4505956efc70ba03","receiverWalletId":"5cdb43db4505956efc70ba04","status":"succes","type":"topup","amount":200000,"totalFee":0,"createdAt":"2019-05-14T23:25:34.703Z","updatedAt":"2019-05-14T23:25:35.132Z","__v":0,"details":"none."}

This is a sample message from my Kafka topic:

{"schema":{"type":"struct","fields":[{"type":"string","optional":true,"field":"sender"},{"type":"string","optional":true,"field":"receiver"},{"type":"string","optional":true,"field":"receiverWalletId"},{"type":"string","optional":true,"field":"status"},{"type":"string","optional":true,"field":"type"},{"type":"int32","optional":true,"field":"amount"},{"type":"int32","optional":true,"field":"totalFee"},{"type":"int64","optional":true,"field":"createdAt"},{"type":"int64","optional":true,"field":"updatedAt"},{"type":"int32","optional":true,"field":"__v"},{"type":"string","optional":true,"field":"from"},{"type":"string","optional":true,"field":"orderId"},{"type":"string","optional":true,"field":"id"}],"optional":false,"name":"mongo_conn.digi.transactions"},"payload":{"sender":"5cef970ca2e9c273c655483","receiver":"5cef970ca2e9c27355c483","receiverWalletId":"5cef970ca2e9c27556c484","status":"pending","type":"topup","amount":6000,"totalFee":0,"createdAt":1560708024322,"updatedAt":1560708024753,"__v":0,"from":"smt","orderId":"d7a97581-9d18-79cd-8b09-16e400a43714","id":"5d0683b8be4af834abe3cf58"}}

This is my connect-mongodb-source.properties:

name=mongodb-source-connector
connector.class=io.debezium.connector.mongodb.MongoDbConnector
mongodb.hosts=repracli/**.**.**.***:27017
mongodb.name=mongo_conn
initial.sync.max.threads=1
tasks.max=1
transforms=unwrap
transforms.unwrap.type=io.debezium.connector.mongodb.transforms.UnwrapFromMongo$
transforms.unwrap.operation.header=true

3 answers:

Answer 0 (score: 0)

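Debezium streams the data in the format in which it is stored in the oplog. The date you see looks like a Unix timestamp: the number of milliseconds since the epoch (1970-01-01T00:00:00Z).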

You can write an SMT (https://cwiki.apache.org/confluence/display/KAFKA/KIP-66%3A+Single+Message+Transforms+for+Kafka+Connect) that processes each message and converts the fields in question to the string representation you prefer.

If you look at org.bson.BsonDateTime, you will see that its value really is a long.
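To confirm that the number in the topic is an ordinary instant in time, here is a minimal Python sketch (standard library only) that decodes the `createdAt` value from the topic sample in the question. The helper name `millis_to_iso` is made up for illustration and is not part of Debezium or Kafka Connect.

```python
from datetime import datetime, timezone


def millis_to_iso(epoch_millis: int) -> str:
    """Render epoch milliseconds as an ISO-8601 UTC string with millisecond precision."""
    # Split with integer arithmetic so no float rounding touches the milliseconds.
    seconds, millis = divmod(epoch_millis, 1000)
    dt = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return dt.strftime("%Y-%m-%dT%H:%M:%S") + f".{millis:03d}Z"


# createdAt from the Kafka topic sample in the question
print(millis_to_iso(1560708024322))  # 2019-06-16T18:00:24.322Z
```

The result has the same shape as the dates in the MongoDB sample, so nothing is lost in the topic; only the representation differs.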

Answer 1 (score: 0)

Solved:

name=mongodb-source-connector
connector.class=io.debezium.connector.mongodb.MongoDbConnector
mongodb.hosts=repracli/**.**.**.***:27017
mongodb.name=mongo_conn
initial.sync.max.threads=1
tasks.max=1
transforms=unwrap,convert,convert2,convert3,convert4
transforms.unwrap.type=io.debezium.connector.mongodb.transforms.UnwrapFromMongoDbEnvelope
transforms.unwrap.operation.header=true
transforms.convert.type=org.apache.kafka.connect.transforms.TimestampConverter$Value
transforms.convert.target.type=string
transforms.convert.field=createdAt
transforms.convert.format=yyyy-MM-dd HH:mm:ss ZZZ
transforms.convert2.type=org.apache.kafka.connect.transforms.TimestampConverter$Value
transforms.convert2.target.type=string
transforms.convert2.field=updatedAt
transforms.convert2.format=yyyy-MM-dd HH:mm:ss ZZZ
transforms.convert3.type=org.apache.kafka.connect.transforms.TimestampConverter$Value
transforms.convert3.target.type=string
transforms.convert3.field=created_at
transforms.convert3.format=yyyy-MM-dd HH:mm:ss ZZZ
transforms.convert4.type=org.apache.kafka.connect.transforms.TimestampConverter$Value
transforms.convert4.target.type=string
transforms.convert4.field=updated_at
transforms.convert4.format=yyyy-MM-dd HH:mm:ss ZZZ
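The configuration above repeats the same four lines for every date field. As a convenience, a small Python sketch along these lines could generate them from a list of field names. The function name `build_transform_props` and the `convertN` alias numbering are assumptions made for illustration; only the property keys and the `TimestampConverter$Value` class name come from the configuration above. The settings for the `unwrap` transform itself are not generated and still have to be written by hand.

```python
def build_transform_props(fields, fmt="yyyy-MM-dd HH:mm:ss ZZZ"):
    """Return connector property lines that chain one TimestampConverter per field."""
    # Hypothetical alias scheme: convert1, convert2, ... in field order.
    aliases = [f"convert{i}" for i in range(1, len(fields) + 1)]
    # "unwrap" stays first so the converters see the flattened document.
    lines = ["transforms=" + ",".join(["unwrap", *aliases])]
    for alias, field in zip(aliases, fields):
        prefix = f"transforms.{alias}"
        lines += [
            f"{prefix}.type=org.apache.kafka.connect.transforms.TimestampConverter$Value",
            f"{prefix}.target.type=string",
            f"{prefix}.field={field}",
            f"{prefix}.format={fmt}",
        ]
    return lines


for line in build_transform_props(["createdAt", "updatedAt"]):
    print(line)
```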

Answer 2 (score: 0)

For more than one conversion, you will need:

transforms=unwrap,convert1,convert2
transforms.unwrap.type=io.debezium.connector.mongodb.transforms.UnwrapFromMongoDbEnvelope
transforms.unwrap.operation.header=true
transforms.convert1.type=org.apache.kafka.connect.transforms.TimestampConverter$Value
transforms.convert1.target.type=string
transforms.convert1.field=createdAt
transforms.convert1.format=yyyy-MM-dd HH:mm:ss ZZZ
transforms.convert2.type=org.apache.kafka.connect.transforms.TimestampConverter$Value
transforms.convert2.target.type=string
transforms.convert2.field=updatedAt
transforms.convert2.format=yyyy-MM-dd HH:mm:ss ZZZ
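On the consuming side, a string written with the pattern `yyyy-MM-dd HH:mm:ss ZZZ` can be parsed back into a timestamp. The Python sketch below assumes the converter emits the offset in the `+0000` form, and the sample string is an assumed rendering of the question's `createdAt` value, not output captured from a real topic. The helper name `parse_converted` is hypothetical.

```python
from datetime import datetime


def parse_converted(value: str) -> datetime:
    """Parse a 'yyyy-MM-dd HH:mm:ss ZZZ' style string into an aware datetime."""
    # %z accepts a numeric offset such as +0000.
    return datetime.strptime(value, "%Y-%m-%d %H:%M:%S %z")


# Assumed rendering of createdAt=1560708024322 under the pattern above
dt = parse_converted("2019-06-16 18:00:24 +0000")
print(int(dt.timestamp()) * 1000)  # 1560708024000
```

The pattern has no fractional-seconds part, so the original 322 milliseconds cannot be recovered from the string. If sub-second precision matters, the format would need a milliseconds component.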