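I am using Confluent Community with a Postgres database and have run into the following problem.
Events flow into Kafka fine and the topics are created. I created a stream on top of a topic and rekeyed it, because the key was null.
On top of the new topic that backs the rekeyed stream, I created a table. The goal is to have a continuously updated table of objects (here, categories).
The problem is that when I run a manual UPDATE in the database, the table is never updated with the new data. The rows just keep getting added, as in a stream. Of course, I ran the select a second time, because I know that "updated" rows show up while a query is still running.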
ksql> select * from categories;
1568287458487 | 1 | 1 | Beverages | Soft drinks, coffees, teas, beers, and ales
1568287458487 | 2 | 2 | Condiments | Sweet and savory sauces, relishes, spreads, and seasonings
1568287458488 | 3 | 3 | Confections | Desserts, candies, and sweet breads
1568287458488 | 4 | 4 | Dairy Products | Cheeses
1568287458488 | 5 | 5 | Grains/Cereals | Breads, crackers, pasta, and cereal
1568287458488 | 6 | 6 | Meat/Poultry | Prepared meats
1568287458489 | 7 | 7 | Produce | Dried fruit and bean curd
1568287458489 | 8 | 8 | Seafood | Seaweed and fish
1568288647248 | 8 | 8 | Seafood2 | Seaweed and fish
1568290562250 | 1 | 1 | asdf | Soft drinks, coffees, teas, beers, and ales
1568296165250 | 8 | 8 | Seafood3 | Seaweed and fish
1568296704747 | 8 | 8 | Seafood4 | Seaweed and fish
^CQuery terminated
ksql> select * from categories;
1568287458487 | 1 | 1 | Beverages | Soft drinks, coffees, teas, beers, and ales
1568287458487 | 2 | 2 | Condiments | Sweet and savory sauces, relishes, spreads, and seasonings
1568287458488 | 3 | 3 | Confections | Desserts, candies, and sweet breads
1568287458488 | 4 | 4 | Dairy Products | Cheeses
1568287458488 | 5 | 5 | Grains/Cereals | Breads, crackers, pasta, and cereal
1568287458488 | 6 | 6 | Meat/Poultry | Prepared meats
1568287458489 | 7 | 7 | Produce | Dried fruit and bean curd
1568287458489 | 8 | 8 | Seafood | Seaweed and fish
1568288647248 | 8 | 8 | Seafood2 | Seaweed and fish
1568290562250 | 1 | 1 | asdf | Soft drinks, coffees, teas, beers, and ales
1568296165250 | 8 | 8 | Seafood3 | Seaweed and fish
1568296704747 | 8 | 8 | Seafood4 | Seaweed and fish
^CQuery terminated
ksql>
The categories table in Postgres:
CREATE TABLE categories (
category_id smallint NOT NULL,
category_name character varying(15) NOT NULL,
description text
);
The categories table in KSQL:
ksql> describe extended categories;
Name : CATEGORIES
Type : TABLE
Key field : CATEGORY_ID_ST
Key format : STRING
Timestamp field : Not set - using <ROWTIME>
Value format : AVRO
Kafka topic : categories_rk (partitions: 1, replication: 1)
Field | Type
--------------------------------------------
ROWTIME | BIGINT (system)
ROWKEY | VARCHAR(STRING) (system)
CATEGORY_ID_ST | VARCHAR(STRING)
CATEGORY_NAME | VARCHAR(STRING)
DESCRIPTION | VARCHAR(STRING)
MESSAGETOPIC | VARCHAR(STRING)
MESSAGESOURCE | VARCHAR(STRING)
--------------------------------------------
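How can a table, which should have unique ROWKEYs, keep adding more "updated" rows with the same ROWKEY?
I actually expected the table to show an always up-to-date list of categories, as described in https://www.youtube.com/watch?v=DPGn-j7yD68&list=PLa7VYi0yPIH2eX8q3mPpZAn3qCS1eDX8W&index=9:
"A table is a materialized view of events, with only the latest value for each key". But maybe I have misunderstood?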
Answer 0 (score: 1)
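Tables in KSQL are updated continuously as new data arrives. The output topic that the table's rows are written to is called a changelog: it is an immutable log of changes to the table. If a particular key is updated multiple times, the output topic will contain multiple messages for that key. Each new value replaces the previous one.
When you run a query such as: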
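To make the changelog idea concrete, here is a minimal Python sketch that replays a few of the records from the question's output and keeps only the latest value per key. It is an illustration only, not how KSQL is implemented; the tuple layout and the `materialize` helper are assumptions made for this example.

```python
from typing import Dict, Iterable, Tuple

# (rowtime, key, category_name) records in the order they were written.
# The values are copied from the query output in the question.
Record = Tuple[int, str, str]

CHANGELOG = [
    (1568287458487, "1", "Beverages"),
    (1568287458489, "8", "Seafood"),
    (1568288647248, "8", "Seafood2"),
    (1568290562250, "1", "asdf"),
    (1568296165250, "8", "Seafood3"),
    (1568296704747, "8", "Seafood4"),
]


def materialize(changelog: Iterable[Record]) -> Dict[str, str]:
    """Replay the changelog; a later record for a key overwrites the earlier one."""
    table: Dict[str, str] = {}
    for _rowtime, key, name in changelog:
        table[key] = name
    return table


if __name__ == "__main__":
    print(len(CHANGELOG))          # number of messages in the changelog
    print(materialize(CHANGELOG))  # one entry per key
```

The log keeps every message, while the replayed table holds a single entry per key, which is the distinction between the topic and the table described above.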
select * from categories;
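In the version of ksql you are using, you are not running a traditional query of the kind you would run in a traditional RDBMS. A traditional query would give you the current set of rows in the table. In ksql, the query above streams every update to the rows instead. So if the same key is updated multiple times, you will see that key output by the query multiple times.
In more recent versions of ksqlDB, the query above would be written as: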
select * from categories emit changes;
Inside ksql, each key is stored only once in the materialized table, and the stored value is always the latest version seen.
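The difference between what the streaming query prints and what the table stores can be sketched in Python as follows. `TinyTable` and its methods are hypothetical names invented for this illustration; they are not part of any KSQL or ksqlDB API.

```python
from typing import Dict, Iterator, List, Tuple

Update = Tuple[str, str]  # (key, category_name)


class TinyTable:
    """Toy stand-in for a materialized table; not a KSQL API."""

    def __init__(self) -> None:
        self._state: Dict[str, str] = {}

    def stream_changes(self, updates: List[Update]) -> Iterator[Update]:
        """Like the streaming SELECT: emit every update as it is applied."""
        for key, value in updates:
            self._state[key] = value
            yield key, value

    def snapshot(self) -> Dict[str, str]:
        """Like a point-in-time read of the stored state: one value per key."""
        return dict(self._state)


# The four versions of key 8 seen in the question's output.
updates = [("8", "Seafood"), ("8", "Seafood2"), ("8", "Seafood3"), ("8", "Seafood4")]
table = TinyTable()
emitted = list(table.stream_changes(updates))
```

Here `emitted` contains all four versions of key 8, which corresponds to the repeated rows in the question, while `table.snapshot()` holds only the last one.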