How do I write the JSON to create a distributed Kafka Connect connector that uses transforms?

Asked: 2017-09-08 14:43:36

Tags: apache-kafka apache-kafka-connect

Using standalone mode, I created a connector with my custom transform:

name=rabbitmq-source
connector.class=com.github.jcustenborder.kafka.connect.rabbitmq.RabbitMQSourceConnector
tasks.max=1
rabbitmq.host=rabbitmq-server
rabbitmq.queue=answers
kafka.topic=net.gutefrage.answers
transforms=extractFields
transforms.extractFields.type=net.gutefrage.connector.transforms.ExtractFields$Value
transforms.extractFields.fields=body,envelope.routingKey
transforms.extractFields.structName=net.gutefrage.events

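But for a distributed connector, what is the syntax for the PUT request to the Connect REST API? I could not find any example in the documentation.

I have already tried several things: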

cat <<EOF >/tmp/connector
{
  "name": "rabbitmq-source",
  "config": {
    "connector.class": "com.github.jcustenborder.kafka.connect.rabbitmq.RabbitMQSourceConnector",
    "tasks.max": "1",
    "rabbitmq.host": "rabbitmq-server",
    "rabbitmq.queue": "answers",
    "kafka.topic": "net.gutefrage.answers",
    "transforms": "extractFields",
    "transforms.extractFields": {
      "type": "net.gutefrage.connector.transforms.ExtractFields$Value",
      "fields": "body,envelope.routingKey",
      "structName": "net.gutefrage.events"
    }
  }
}
EOF

curl -vs --stderr - -X POST -H "Content-Type: application/json" --data @/tmp/connector "http://localhost:8083/connectors"
rm /tmp/connector

Or this, which does not work either:

{
  "name": "rabbitmq-source",
  "config": {
    "connector.class": "com.github.jcustenborder.kafka.connect.rabbitmq.RabbitMQSourceConnector",
    "tasks.max": "1",
    "rabbitmq.host": "rabbitmq-server",
    "rabbitmq.queue": "answers",
    "kafka.topic": "net.gutefrage.answers",
    "transforms": "extractFields",
    "transforms.extractFields.type": "net.gutefrage.connector.transforms.ExtractFields$Value",
    "transforms.extractFields.fields": "body,envelope.routingKey",
    "transforms.extractFields.structName": "net.gutefrage.events"
  }
}

For the last variant I get the following error:

{"error_code":400,"message":"Connector configuration is invalid and contains the following 1 error(s):\nInvalid value class net.gutefrage.connector.transforms.ExtractFields for configuration transforms.extractFields.type: Error getting config definition from Transformation: null\nYou can also find the above list of errors at the endpoint `/{connectorType}/config/validate`"}

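Note that it works fine with the properties format (creating a new connector in the connector UI of Landoop's fast-data-dev). Interestingly, the 'convert to curl' feature of Landoop's UI produces exactly the same JSON as my second example.

Update

To make sure that Landoop, Docker and my custom transform are not the problem, I started ZooKeeper, a broker, Schema Registry and Kafka Connect 3.3.0 in distributed mode, using COP's standard distributed properties: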

bin/connect-distributed etc/schema-registry/connect-avro-distributed.properties

which logs

[2017-09-13 14:07:52,930] INFO Loading plugin from: /opt/connectors/confluent-oss-gf-assembly-1.0.jar (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:176)
[2017-09-13 14:07:53,711] INFO Registered loader: PluginClassLoader{pluginLocation=file:/opt/connectors/confluent-oss-gf-assembly-1.0.jar} (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:199)
[2017-09-13 14:07:53,711] INFO Added plugin 'com.github.jcustenborder.kafka.connect.rabbitmq.RabbitMQSourceConnector' (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:132)
[2017-09-13 14:07:53,712] INFO Added plugin 'net.gutefrage.connector.transforms.ExtractFields$Key' (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:132)
[2017-09-13 14:07:53,712] INFO Added plugin 'net.gutefrage.connector.transforms.ExtractFields$Value' (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:132)

So far so good. Then I created a connector configuration:

cat <<EOF >/tmp/connector
{
  "name": "rabbitmq-source",
  "config": {
    "connector.class": "com.github.jcustenborder.kafka.connect.rabbitmq.RabbitMQSourceConnector",
    "tasks.max": "1",
    "rabbitmq.host": "rabbitmq-server",
    "rabbitmq.queue": "answers",
    "kafka.topic": "net.gutefrage.answers",
    "transforms": "extractFields",
    "transforms.extractFields.type": "org.apache.kafka.connect.transforms.ExtractField$Value",
    "transforms.extractFields.field": "body"
  }
}
EOF

Note that I am now using the standard (bundled) ExtractField transform. When I post it with

curl -vs --stderr - -X POST -H "Content-Type: application/json" --data @/tmp/connector "http://localhost:8083/connectors"

I get the same error:

{"error_code":400,"message":"Connector configuration is invalid and contains the following 1 error(s):\nInvalid value class org.apache.kafka.connect.transforms.ExtractField for configuration transforms.extractFields.type: Error getting config definition from Transformation: null\nYou can also find the above list of errors at the endpoint `/{connectorType}/config/validate`"}

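4 answers:

Answer 0 (score: 1)

If you run a Kafka Connect worker in standalone mode, you must start the worker and supply a worker configuration file plus one or more connector configuration files. All of these configuration files are in Java properties format, so the first configuration example you provided is correctly formatted: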

name=rabbitmq-source
connector.class=com.github.jcustenborder.kafka.connect.rabbitmq.RabbitMQSourceConnector
tasks.max=1
rabbitmq.host=rabbitmq-server
rabbitmq.queue=answers
kafka.topic=net.gutefrage.answers
transforms=extractFields
transforms.extractFields.type=net.gutefrage.connector.transforms.ExtractFields$Value
transforms.extractFields.fields=body,envelope.routingKey
transforms.extractFields.structName=net.gutefrage.events

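If you run the Kafka Connect worker in distributed mode, you must first start the distributed worker(s) and then, as a second step, create the connector through the REST API by sending a JSON document to the /connectors endpoint. Creating a connector on /connectors is a POST request, as in your curl command; PUT is used on /connectors/{name}/config, which takes only the contents of "config". The JSON document has the same format as your second JSON document: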

{
  "name": "rabbitmq-source",
  "config": {
    "connector.class": "com.github.jcustenborder.kafka.connect.rabbitmq.RabbitMQSourceConnector",
    "tasks.max": "1",
    "rabbitmq.host": "rabbitmq-server",
    "rabbitmq.queue": "answers",
    "kafka.topic": "net.gutefrage.answers",
    "transforms": "extractFields",
    "transforms.extractFields.type": "net.gutefrage.connector.transforms.ExtractFields$Value",
    "transforms.extractFields.fields": "body,envelope.routingKey",
    "transforms.extractFields.structName": "net.gutefrage.events"
  }
}

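The Confluent CLI, included in the Confluent Open Source Platform, is a developer tool that helps you get started quickly by running a ZooKeeper instance, a Kafka broker, the Confluent Schema Registry, the REST Proxy, and a Connect worker in distributed mode. When you load a connector, you specify the connector configuration either as a JSON file or as a properties file; the CLI uses jq to convert the latter to JSON.

However, the error you reported is: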
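To make the relationship between the two formats concrete, here is a minimal sketch of a properties-to-JSON conversion. It is not how the Confluent CLI does it (the CLI relies on jq); props_to_json is a hypothetical helper name, and the sketch assumes simple key=value lines whose values contain no double quotes or backslashes.

```shell
# Hypothetical helper: turn a connector .properties file into the JSON document
# expected by POST /connectors. Assumes plain key=value lines and values without
# double quotes or backslashes (no escaping is done).
props_to_json() {
  awk -F= '
    /^[ \t]*(#|$)/ { next }                 # skip comments and blank lines
    {
      key = $1
      val = substr($0, length($1) + 2)      # everything after the first "="
      if (key == "name") { name = val; next }
      entries = entries sep sprintf("    \"%s\": \"%s\"", key, val)
      sep = ",\n"
    }
    END {
      printf "{\n  \"name\": \"%s\",\n  \"config\": {\n%s\n  }\n}\n", name, entries
    }
  ' "$1"
}
```

For example, `props_to_json rabbitmq-source.properties > /tmp/connector` (the file name is only an illustration) would write a document of the same shape as the JSON above, with each transforms.* key kept flat rather than nested.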

{
  "error_code":400,
  "message":"Connector configuration is invalid and contains the following 1 error(s):\nInvalid value class net.gutefrage.connector.transforms.ExtractFields for configuration transforms.extractFields.type: Error getting config definition from Transformation: null\nYou can also find the above list of errors at the endpoint `/{connectorType}/config/validate`"
}

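The important part of this error message is "Error getting config definition from Transformation: null". Although this is rather cryptic, it means that the config() method of the Java class net.gutefrage.connector.transforms.ExtractFields returns null.

Make sure that the net.gutefrage.connector.transforms.ExtractFields$Value string you specified is the correct fully qualified name of the nested static class Value, and that the Value class fully and correctly implements the org.apache.kafka.connect.transforms.Transformation<R extends ConnectRecord<R>> interface. Note that the config() method must return a non-null ConfigDef object.

See this example of a Single Message Transform (SMT) that ships with Apache Kafka, or Robin's blog post for other examples.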

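Answer 1 (score: 1)

Make sure that the $Value in transforms.extractFields.type=net.gutefrage.connector.transforms.ExtractFields$Value is not interpreted as a shell variable by bash when you write the file with cat and a here-document. Fixing that worked for me.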
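A small sketch of the difference, assuming a POSIX-style shell in which no variable named Value is set. With an unquoted here-document delimiter, $Value is expanded to an empty string, which is consistent with the error message in the question naming the class without its $Value suffix. Quoting the delimiter ('EOF') passes the body through literally.

```shell
unset Value

# Unquoted delimiter: the shell expands $Value inside the here-document.
unquoted=$(cat <<EOF
"transforms.extractFields.type": "net.gutefrage.connector.transforms.ExtractFields$Value"
EOF
)

# Quoted delimiter: no expansion, the text is kept exactly as written.
quoted=$(cat <<'EOF'
"transforms.extractFields.type": "net.gutefrage.connector.transforms.ExtractFields$Value"
EOF
)

printf '%s\n' "$unquoted" "$quoted"
```

Applied to the question's script, that means writing `cat <<'EOF' >/tmp/connector`; escaping the dollar sign as `\$Value` inside an unquoted here-document is an alternative.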

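Answer 2 (score: 0)

To use the JSON format for the connector configuration with the CP Connect CLI, the jq tool must be installed on the machine that runs the Kafka Connect cluster.

E.g. for Landoop's fast-data-dev environment, you have to run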

docker exec rabbitmqconnect_fast-data-dev_1 apk add --no-cache jq

Then this will work:

docker exec rabbitmqconnect_fast-data-dev_1 /opt/confluent-3.3.0/bin/confluent config rabbitmq-source -d /tmp/connector-config.json

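However, this does not solve the problem when using the connector REST endpoint.

Answer 3 (score: 0)

With fast-data-dev, you can build a JAR file for any connector and then simply add it to the classpath, following the instructions at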

https://github.com/Landoop/fast-data-dev#enable-additional-connectors

The UI will automatically detect the new connector and will show instructions when you click NEW for the new connector:

http://localhost:3030/kafka-connect-ui

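Something else worth trying: fast-data-dev already ships with a generic MQTT sink, so you could try that out. See the instructions at http://docs.datamountaineer.com/en/latest/mqtt-sink.html

What you effectively need is connect.mqtt.kcql=INSERT INTO /answers SELECT body FROM net.gutefrage.answers

Since this is a generic MQTT connector, you may need to add the rabbitmq client library using the enable-additional-connectors instructions.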