我们在生产中使用kafka,我尝试朝同一方向推动KSQL的采用和使用。但是我已经通过一个简单的表-表联接失败了。我首先尝试了我们的生产数据,但遇到了问题。所以我以为我错过了一些东西,然后从融合的文档中移回示例,并遇到了同样的问题。 我将用示例数据https://docs.confluent.io/current/ksql/docs/tutorials/basics-docker.html#table-table-join解释我的问题 当我创建了两个表并尝试连接数据时,它就起作用了,但是当我尝试更改或添加某些内容时,就会在表中获得新的条目。从我发现的所有示例中,我发现在融合视频甚至是youtube视频中都不会发生这种情况。
创建记录
docker run --interactive --rm --network tutorials_default \
confluentinc/cp-kafkacat \
kafkacat -b kafka:39092 \
-t warehouse_location \
-K: \
-P <<EOF
1:{"warehouse_id":1,"city":"Leeds","country":"UK"}
2:{"warehouse_id":2,"city":"Sheffield","country":"UK"}
3:{"warehouse_id":3,"city":"Berlin","country":"Germany"}
EOF
docker run --interactive --rm --network tutorials_default \
confluentinc/cp-kafkacat \
kafkacat -b kafka:39092 \
-t warehouse_size \
-K: \
-P <<EOF
1:{"warehouse_id":1,"square_footage":16000}
2:{"warehouse_id":2,"square_footage":42000}
3:{"warehouse_id":3,"square_footage":94000}
EOF
创建表格
CREATE TABLE WAREHOUSE_LOCATION (WAREHOUSE_ID INT, CITY VARCHAR, COUNTRY VARCHAR)
WITH (KAFKA_TOPIC='warehouse_location',
VALUE_FORMAT='JSON',
KEY='WAREHOUSE_ID');
CREATE TABLE WAREHOUSE_SIZE (WAREHOUSE_ID INT, SQUARE_FOOTAGE DOUBLE)
WITH (KAFKA_TOPIC='warehouse_size',
VALUE_FORMAT='JSON',
KEY='WAREHOUSE_ID');
创建联接表:
CREATE TABLE WH_U AS SELECT WL.WAREHOUSE_ID, WL.CITY, WL.COUNTRY, WS.SQUARE_FOOTAGE
FROM WAREHOUSE_LOCATION WL
LEFT JOIN WAREHOUSE_SIZE WS
ON WL.WAREHOUSE_ID=WS.WAREHOUSE_ID;
有了这个,我得到了预期的结果:
1 | Leeds | UK | 16000.0
2 | Sheffield | UK | 42000.0
3 | Berlin | Germany | 94000.0
但是当我添加或更改记录时,会发生这种情况:
1566375174496 | 1 | 1 | Leeds | UK | 16000.0
1566375174496 | 2 | 2 | Sheffield | UK | 42000.0
1566375174496 | 3 | 3 | Berlin | Germany | 94000.0
1566375595372 | 4 | 4 | London | UK | null
1566375641291 | 4 | 4 | London | UK | 94000.0
1566375641291 | 1 | 1 | Leeds | UK | 1.0
我期望:
1566375174496 | 1 | 1 | Leeds | UK | 1.0
1566375174496 | 2 | 2 | Sheffield | UK | 42000.0
1566375174496 | 3 | 3 | Berlin | Germany | 94000.0
1566375641291 | 4 | 4 | London | UK | 94000.0
我想念什么?
已解决
此行为的原因是ksql服务器中有一个简单的env。 KSQL_CACHE_MAX_BYTES_BUFFERING设置为0