Null ROWKEY in a stream causes a NullPointerException in a KSQL join statement

Time: 2018-02-14 11:13:08

Tags: apache-kafka ksql

I have created a stream from a Kafka topic. The stream has the following structure:

ksql> describe trans_live2;

 Field       | Type                      
-----------------------------------------
 ROWTIME     | BIGINT           (system) 
 ROWKEY      | VARCHAR(STRING)  (system) 
 ID          | INTEGER                   
 DESCRIPTION | VARCHAR(STRING)           
 AMOUNT      | DOUBLE                    
 CURRENCYID  | INTEGER                   
-----------------------------------------

When a new row is added to the MySQL table, the source connector sends that row to Apache Kafka, where it is in turn streamed into trans_live2.

For example, after running this in MySQL:

insert into transactions values(15, 'test15', 10.05, 1);

KSQL will contain:

select * from trans_live2;
1518606166292 | null | 15 | test15 | 10.05 | 1

But I don't know why ROWKEY is null.

I also tried to join this stream with the table latest:

ksql> describe latest;

 Field        | Type                      
------------------------------------------
 ROWTIME      | BIGINT           (system) 
 ROWKEY       | VARCHAR(STRING)  (system) 
 CURRENCYID   | INTEGER          (key)    
 MIDPRICE     | DOUBLE           (key)    
 MAXTIMESTAMP | BIGINT                    
------------------------------------------

using this statement:

CREATE stream live_transactions_stream3 AS SELECT t1.id, t1.description, t1.amount, t1.currencyid, t2.midprice, t2.maxtimestamp FROM trans_live2 t1 LEFT JOIN LATEST t2 on t1.currencyid = t2.currencyid;

But I get the following error:

Exception in thread "ksql_query_CSAS_LIVE_TRANSACTIONS_STREAM3-0d676326-237b-4320-9dab-542b42a960d9-StreamThread-161" org.apache.kafka.streams.errors.StreamsException: Exception caught in process. taskId=0_3, processor=KSTREAM-SOURCE-0000000012, topic=ksql_query_CSAS_LIVE_TRANSACTIONS_STREAM3-KSTREAM-MAP-0000000009-repartition, partition=3, offset=0
    at org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:238)
    at org.apache.kafka.streams.processor.internals.AssignedStreamsTasks.process(AssignedStreamsTasks.java:94)
    at org.apache.kafka.streams.processor.internals.TaskManager.process(TaskManager.java:422)
    at org.apache.kafka.streams.processor.internals.StreamThread.processAndMaybeCommit(StreamThread.java:924)
    at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:804)
    at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:756)
    at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:726)
Caused by: org.apache.kafka.common.errors.SerializationException: org.apache.kafka.common.errors.SerializationException: Error serializing Avro message
Caused by: org.apache.kafka.common.errors.SerializationException: Error serializing Avro message
Caused by: java.lang.NullPointerException: null of double in field MIDPRICE of ksql.avro_schema

My guess is that it is caused by the null ROWKEY.

I would like to know whether these exceptions are related to the null ROWKEY of my stream and, if so, how I can fix the problem.

1 Answer:

Answer 0 (score: 2)

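This is a known bug that has been fixed in the following PR: https://github.com/confluentinc/ksql/pull/679 The problem is that the Avro schema generated for the result does not allow null values for its fields. With the PR above, you will be able to set null values for fields.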
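
To illustrate the point about the schema: the exception names MIDPRICE, a column from the table side of the LEFT JOIN, which is null whenever a stream row has no match in latest. In Avro, a field accepts null only if its type is a union that includes "null". The sketch below is a hand-rolled check written for this illustration, not the Avro library or KSQL code; the helper names `accepts` and `invalid_fields` and the simplified schema dictionaries are assumptions made up for the example.

```python
# Illustration only: a simplified stand-in for Avro type checking,
# not the real Avro serializer and not KSQL's schema generation.

def accepts(avro_type, value):
    """Return True if `value` fits the given (simplified) Avro type."""
    if isinstance(avro_type, list):  # a union such as ["null", "double"]
        return any(accepts(branch, value) for branch in avro_type)
    if avro_type == "null":
        return value is None
    if avro_type == "double":
        return isinstance(value, float)
    if avro_type == "int":
        return isinstance(value, int) and not isinstance(value, bool)
    raise ValueError(f"unsupported type in this sketch: {avro_type!r}")


def invalid_fields(schema, record):
    """List the field names whose value does not fit the schema."""
    return [
        name
        for name, avro_type in schema.items()
        if not accepts(avro_type, record.get(name))
    ]


# Shaped like the failing schema: MIDPRICE is a plain double.
strict_schema = {"ID": "int", "CURRENCYID": "int", "MIDPRICE": "double"}

# Same fields, each wrapped in a union with "null" (the nullable form).
nullable_schema = {name: ["null", t] for name, t in strict_schema.items()}

# A LEFT JOIN row with no match in the table: the table column is null.
unmatched_row = {"ID": 15, "CURRENCYID": 1, "MIDPRICE": None}

print(invalid_fields(strict_schema, unmatched_row))    # ['MIDPRICE']
print(invalid_fields(nullable_schema, unmatched_row))  # []
```

Under these assumptions, only the non-nullable schema rejects the unmatched row, which matches the "null of double in field MIDPRICE" message in the stack trace.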