Kafka JDBC source connector: timestamp mode not working with SQLite3

Date: 2019-06-18 10:14:56

Tags: sqlite jdbc apache-kafka apache-kafka-connect confluent

I am setting up a database with a table that has a timestamp column, and I am trying to use the connector's timestamp mode to capture incremental changes from the database.
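
(In timestamp mode the connector repeatedly polls for rows whose timestamp column falls between the last offset it committed and the database's current time. Each poll issues, roughly, a query of the following shape; this is a simplified sketch, not the connector's verbatim SQL:)

SELECT * FROM test_timestamp
WHERE Timestamp > ?    -- last timestamp value the connector recorded
  AND Timestamp < ?    -- the database's current time
ORDER BY Timestamp ASC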

But kafka-connect-jdbc is not reading any data from the table. Here is what I did.

Created the table:

sqlite> CREATE TABLE test_timestamp(id integer primary key not null,
   ...>                   payment_type text not null,
   ...>                   Timestamp DATETIME DEFAULT CURRENT_TIMESTAMP,
   ...>                   user_id int not null);
sqlite> INSERT INTO test_timestamp (ID, PAYMENT_TYPE, USER_ID) VALUES (3,'FOO',1);
sqlite> select * from test_timestamp;
3|FOO|2019-06-18 05:31:22|1
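
(For reference, the CURRENT_TIMESTAMP default stores a second-precision text value, which can be confirmed in the sqlite3 shell; this detail turns out to matter in the first answer below:)

sqlite> SELECT typeof(Timestamp), Timestamp FROM test_timestamp;
text|2019-06-18 05:31:22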

My jdbc-source connector configuration is as follows:

$ curl -s "http://localhost:8083/connectors/jdbc-source/config"|jq '.'
{
  "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
  "mode": "timestamp",
  "timestamp.column.name": "timestamp",
  "topic.prefix": "testdb-",
  "validate.non.null": "false",
  "tasks.max": "1",
  "name": "jdbc-source",
  "connection.url": "jdbc:sqlite:/tmp/test.db"
}

The jdbc-source connector loads successfully and creates the topic:

$ kafka-topics --list --bootstrap-server localhost:9092
..
testdb-test_timestamp

But there is no data in the topic.
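
(One way to confirm the topic is empty, assuming the stock console consumer that ships with Kafka; --timeout-ms makes it exit rather than block forever:)

$ kafka-console-consumer --bootstrap-server localhost:9092 \
    --topic testdb-test_timestamp --from-beginning --timeout-ms 10000
Processed a total of 0 messages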

Any help?

Thanks.

2 Answers:

Answer 0 (score: 0)

What you are experiencing is a known issue; the details are here: https://github.com/confluentinc/kafka-connect-jdbc/issues/219

Steps to reproduce:

  1. Create the database:

    $ echo 'DROP TABLE test_timestamp; CREATE TABLE test_timestamp(id integer primary key not null,
                     payment_type text not null,
                     Timestamp DATETIME DEFAULT CURRENT_TIMESTAMP,
                     user_id int not null);
    INSERT INTO test_timestamp (ID, PAYMENT_TYPE, USER_ID) VALUES (3,'\''FOO'\'',1);
    select * from test_timestamp;' | sqlite3 /tmp/test.db
    3|FOO|2019-07-03 08:28:43|1
    
  2. Create the connector:

    curl -X PUT "http://localhost:8083/connectors/jdbc-source/config" -H  "Content-Type:application/json"  -d '{
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "mode": "timestamp",
    "timestamp.column.name": "timestamp",
    "topic.prefix": "testdb-",
    "validate.non.null": "false",
    "tasks.max": "1",
    "name": "jdbc-source",
    "connection.url": "jdbc:sqlite:/tmp/test.db"
    }'
    
  3. Check the connector's status:

    $ curl -s "http://localhost:8083/connectors"| \
    jq '.[]'| \
    xargs -I{connector_name} curl -s "http://localhost:8083/connectors/"{connector_name}"/status"| \
    jq -c -M '[.name,.connector.state,.tasks[].state]|join(":|:")'| \
    column -s : -t| sed 's/\"//g'| sort
    jdbc-source  |  RUNNING  |  RUNNING
    
  4. Check the Kafka Connect worker log, and observe the date-parsing error described in issue 219 (a possible workaround follows the trace):

    [2019-07-03 10:40:58,260] ERROR Failed to run query for table TimestampIncrementingTableQuerier{table="test_timestamp", query='null', topicPrefix='testdb-', incrementingColumn='', timestampColumns=[timestamp]}: {} (io.confluent.connect.jdbc.source.JdbcSourceTask:332)
    java.sql.SQLException: Error parsing time stamp
        at org.sqlite.jdbc3.JDBC3ResultSet.getTimestamp(JDBC3ResultSet.java:576)
        at io.confluent.connect.jdbc.dialect.GenericDatabaseDialect.currentTimeOnDB(GenericDatabaseDialect.java:484)
        at io.confluent.connect.jdbc.source.TimestampIncrementingTableQuerier.endTimetampValue(TimestampIncrementingTableQuerier.java:203)
        at io.confluent.connect.jdbc.source.TimestampIncrementingCriteria.setQueryParametersTimestamp(TimestampIncrementingCriteria.java:164)
        at io.confluent.connect.jdbc.source.TimestampIncrementingCriteria.setQueryParameters(TimestampIncrementingCriteria.java:126)
        at io.confluent.connect.jdbc.source.TimestampIncrementingTableQuerier.executeQuery(TimestampIncrementingTableQuerier.java:176)
        at io.confluent.connect.jdbc.source.TableQuerier.maybeStartQuery(TableQuerier.java:92)
        at io.confluent.connect.jdbc.source.TimestampIncrementingTableQuerier.maybeStartQuery(TimestampIncrementingTableQuerier.java:60)
        at io.confluent.connect.jdbc.source.JdbcSourceTask.poll(JdbcSourceTask.java:310)
        at org.apache.kafka.connect.runtime.WorkerSourceTask.poll(WorkerSourceTask.java:245)
        at org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:221)
        at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:175)
        at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:219)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
    Caused by: java.text.ParseException: Unparseable date: "2019-07-03 08:40:58" does not match (\p{Nd}++)\Q-\E(\p{Nd}++)\Q-\E(\p{Nd}++)\Q \E(\p{Nd}++)\Q:\E(\p{Nd}++)\Q:\E(\p{Nd}++)\Q.\E(\p{Nd}++)
        at org.sqlite.date.FastDateParser.parse(FastDateParser.java:299)
        at org.sqlite.date.FastDateFormat.parse(FastDateFormat.java:490)
        at org.sqlite.jdbc3.JDBC3ResultSet.getTimestamp(JDBC3ResultSet.java:573)
        ... 17 more
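
The trace shows the failure is in GenericDatabaseDialect.currentTimeOnDB: the connector asks SQLite for the current time to use as the upper bound of its timestamp query, and the Xerial sqlite-jdbc driver's getTimestamp() cannot parse the second-precision string SQLite returns, because the driver's default pattern expects milliseconds (yyyy-MM-dd HH:mm:ss.SSS, visible in the regex in the ParseException above). One workaround to try, as an unverified sketch: the Xerial driver exposes a date_string_format setting that controls this pattern, and if your driver version accepts settings as JDBC URL query parameters, the connector can be pointed at a URL that declares second precision:

    curl -X PUT "http://localhost:8083/connectors/jdbc-source/config" -H  "Content-Type:application/json"  -d '{
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "mode": "timestamp",
    "timestamp.column.name": "timestamp",
    "topic.prefix": "testdb-",
    "validate.non.null": "false",
    "tasks.max": "1",
    "name": "jdbc-source",
    "connection.url": "jdbc:sqlite:/tmp/test.db?date_string_format=yyyy-MM-dd HH:mm:ss"
    }'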
    

Answer 1 (score: 0)

I stumbled upon a similar problem. In my case the topic was not even created. It turned out that the Connect worker's timezone must match the database's timezone. Setting the db.timezone property to a valid value from this list (under the SHORT_IDS section) in the connector's properties file made it work:

db.timezone=Asia/Kolkata
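
(db.timezone is a JDBC source connector property, so in distributed mode it can equally be added to the connector config shown in the question via the REST API; a sketch, with Asia/Kolkata standing in for whatever zone your database actually uses:)

curl -X PUT "http://localhost:8083/connectors/jdbc-source/config" -H  "Content-Type:application/json"  -d '{
"connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
"mode": "timestamp",
"timestamp.column.name": "timestamp",
"topic.prefix": "testdb-",
"validate.non.null": "false",
"tasks.max": "1",
"name": "jdbc-source",
"connection.url": "jdbc:sqlite:/tmp/test.db",
"db.timezone": "Asia/Kolkata"
}'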