Question

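I am using the Kafka JDBC Connector to import data from a MySQL database into Kafka topics. With the following parameters, I can track newer rows inserted into a given table.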

name=test
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=10

connection.url=jdbc:mysql://localhost:3306/test?user=root&password=asdf
table.whitelist=test_table

mode=incrementing
incrementing.column.name=id

topic.prefix=test-

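What I need is to load all records after a specific ID in the table and also keep tracking all newly inserted records. How can I do this? One solution might be a custom query with a filter, but I am not sure about the query.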

Answer 1

The custom query should be "select * from table where id > X", where X is the specific ID you mentioned.
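One caveat with a filtered query: in incrementing mode the connector adds its own WHERE clause to the configured query, so a query that already ends in a WHERE filter can produce invalid SQL. A common workaround is to wrap the filter in a derived table. The sketch below only builds the strings to show the idea. The class and method names are hypothetical, the start ID 1000 stands in for X, and the appended clause is an approximation of what the connector generates, not its exact output.

```java
import java.util.Locale;

// Hypothetical helper, not part of the connector: builds the value for `query`.
final class FilteredQuery {

    // Wraps the filtered SELECT in a derived table so that a WHERE clause
    // appended later does not collide with the filter's own WHERE.
    static String wrap(String table, String idColumn, long startId) {
        return String.format(Locale.ROOT,
                "SELECT * FROM (SELECT * FROM %s WHERE %s > %d) AS filtered",
                table, idColumn, startId);
    }

    // Assumed (approximate) shape of what incrementing mode appends to the query.
    static String withIncrementingClause(String query, String idColumn) {
        return query + " WHERE " + idColumn + " > ? ORDER BY " + idColumn + " ASC";
    }

    public static void main(String[] args) {
        String query = wrap("test_table", "id", 1000L);
        // Value to put into the connector's `query` setting.
        System.out.println("query=" + query);
        // Roughly the statement that ends up being executed on each poll.
        System.out.println(withIncrementingClause(query, "id"));
    }
}
```

When `query` is used, `table.whitelist` has to be left out of the connector configuration, and `topic.prefix` is taken as the full topic name.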

Answer 2

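I have not done this before, but I still think it is doable; it obviously requires some code changes. In the JdbcSourceTask.start method, the offsets are loaded with the code below.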

offsets = context.offsetStorageReader().offsets(partitions);

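You can define your own offsets here. There is a catch, however: the custom offset would then be loaded every time the connector restarts, instead of the offset that Kafka Connect saved in its offsets topic. To work around this, you can define a custom config property such as the following.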

connector.firsttime=true

You can then use it in the start method as shown below:

String strIsFirstTime = config.getString(JdbcSourceTaskConfig.FIRST_TIME_CONFIG);
if ("true".equals(strIsFirstTime)) {
    // Load a custom offset.
    // lStartingPosition is the value at which you want to start processing.
    // Long.MAX_VALUE is only a placeholder; replace it with the ID after which rows should be loaded.
    Long lStartingPosition = Long.MAX_VALUE;
    // partition is the relevant partition of the table in question.
    offsets.put(partition, new TimestampIncrementingOffset(null, lStartingPosition).toMap());
} else {
    // Otherwise resume from the offsets saved by Kafka Connect.
    offsets = context.offsetStorageReader().offsets(partitions);
}

However, remember to set this custom config to false whenever you restart the connector afterwards.

Let me know if it works.
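The start-up decision described in this answer can be tried out in isolation. The following is a minimal stand-alone sketch, not connector code: plain maps stand in for the Kafka Connect offset types, and the class name, the "incrementing" offset key, and the "table" partition key are assumptions made for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Stand-alone model of the start-up logic; plain maps replace the
// Kafka Connect offset types so the sketch runs without the connector.
final class StartingOffsets {

    // Assumed key under which the incrementing column's last value is stored.
    static final String INCREMENTING_FIELD = "incrementing";

    static Map<Map<String, String>, Map<String, Object>> choose(
            boolean firstTime,
            long startId,
            Map<String, String> partition,
            Map<Map<String, String>, Map<String, Object>> stored) {
        Map<Map<String, String>, Map<String, Object>> offsets = new HashMap<>();
        if (firstTime) {
            // First run: act as if rows up to startId were already processed.
            Map<String, Object> offset = new HashMap<>();
            offset.put(INCREMENTING_FIELD, startId);
            offsets.put(partition, offset);
        } else {
            // Later runs: resume from whatever the framework saved.
            offsets.putAll(stored);
        }
        return offsets;
    }

    public static void main(String[] args) {
        Map<String, String> partition = Collections.singletonMap("table", "test_table");
        Map<Map<String, String>, Map<String, Object>> stored = new HashMap<>();
        stored.put(partition, Collections.singletonMap(INCREMENTING_FIELD, (Object) 1500L));

        // With the flag set, the custom start ID wins over the stored offset.
        System.out.println(choose(true, 1000L, partition, stored));
        // With the flag cleared, the stored offset is used.
        System.out.println(choose(false, 1000L, partition, stored));
    }
}
```

Keeping the decision in one small function makes it easy to see why the flag must be switched off after the first run: as long as it stays true, the stored offset is ignored and reading starts again from the custom ID.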

Answer 3

Another way to accomplish this is to create a view for the custom query and add a filter in its predicate.

create or replace view xyz as select * from table where id > X;


name=test
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=10

connection.url=jdbc:mysql://localhost:3306/test?user=root&password=asdf
# table.whitelist is left out here because it cannot be combined with query

mode=incrementing
incrementing.column.name=id

topic.prefix=test-

poll.interval.ms=300000
query=select id from xyz

How to load rows after a specific id and track newer rows with the Kafka JDBC Connector?
