I'm very new to Confluent KSQL, but not new to Kafka. I have existing topics in Kafka whose data is Avro-serialized. I have the Confluent Schema Registry up and running, and I have configured KSQL to point at the registry.
When I try to create a table based on one of these topics, KSQL complains that the stream cannot be found. When I try to create a stream in KSQL that simply streams my topic, I can't seem to point it at an Avro-serialized topic whose schema is registered in the registry.
Does anyone know how to solve either of these problems? Or is the way I want to use KSQL a poor fit for what it can actually do?
Update
More detail
ksql> show topics;
Kafka Topic | Registered | Partitions | Partition Replicas | Consumers | Consumer Groups
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
COM_FINDOLOGY_MODEL_REPORTING_OUTGOINGFEEDADVERTISERSEARCHDATA | false | 2 | 2 | 0 | 0
COM_FINDOLOGY_MODEL_TRAFFIC_CPATRACKINGCALLBACK | false | 2 | 2 | 0 | 0
COM_FINDOLOGY_MODEL_TRAFFIC_ENTRYPOINTCLICK | true | 10 | 3 | 0 | 0
KSQL configuration
#bootstrap.servers=localhost:9092
bootstrap.servers=host1:9092,host2:9092,host3:9092,host4:9092,host5:9092
#listeners=http://localhost:8088
listeners=http://localhost:59093
ksql.server.ui.enabled=true
ksql.schema.registry.url=http://host1:59092
Schema Registry configuration
# The host name advertised in ZooKeeper. Make sure to set this if running Schema Registry with multiple nodes.
host.name: x.x.x.x
listeners=http://0.0.0.0:59092
# Zookeeper connection string for the Zookeeper cluster used by your Kafka cluster
# (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
#kafkastore.connection.url=localhost:2181
# Alternatively, Schema Registry can now operate without Zookeeper, handling all coordination via
# Kafka brokers. Use this setting to specify the bootstrap servers for your Kafka cluster and it
# will be used both for selecting the master schema registry instance and for storing the data for
# registered schemas.
# (Note that you cannot mix the two modes; use this mode only on new deployments or by shutting down
# all instances, switching to the new configuration, and then starting the schema registry
# instances again.)
kafkastore.bootstrap.servers=PLAINTEXT://host1:9092,PLAINTEXT://host2:9092,PLAINTEXT://host3:9092,PLAINTEXT://host4:9092,PLAINTEXT://host5:9092
# The name of the topic to store schemas in
kafkastore.topic=_schemas
# If true, API requests that fail will include extra debugging information, including stack traces
debug=false
Attempt to work around the problem by registering the external topic
ksql> register topic xxx with (value_format='avro', kafka_topic='COM_FINDOLOGY_MODEL_REPORTING_OUTGOINGFEEDADVERTISERSEARCHDATA');
You need to provide avro schema file path for topics in avro format.
Answer 0 (score: 0)
REGISTER TOPIC is deprecated syntax. You should use CREATE STREAM (or CREATE TABLE, depending on your data access requirements) instead.
So your statement should look like this:
CREATE STREAM MY_STREAM_1 \
WITH (VALUE_FORMAT='AVRO', \
KAFKA_TOPIC='COM_FINDOLOGY_MODEL_REPORTING_OUTGOINGFEEDADVERTISERSEARCHDATA');
Note that I've used \ to break the lines for readability; you don't have to do that.
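If you need table semantics (latest value per key) rather than an event stream, the equivalent CREATE TABLE over the same topic is sketched below. This is an assumption-laden sketch: older KSQL versions also want the key column named in the WITH clause, and the 'ID' field used here is a placeholder for whichever field in your Avro records actually serves as the key.
-- Minimal sketch only: 'ID' is a hypothetical key field, replace it with one from your schema
CREATE TABLE MY_TABLE_1 \
WITH (VALUE_FORMAT='AVRO', \
KAFKA_TOPIC='COM_FINDOLOGY_MODEL_REPORTING_OUTGOINGFEEDADVERTISERSEARCHDATA', \
KEY='ID');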
Answer 1 (score: 0)
I solved the problem I was having by changing what information I consume from the Kafka topic, rather than consuming the entire topic content. The topic contains Avro-encoded data created using ReflectionData (which is fine). KSQL has trouble with non-standard items in the stream, but it can handle the ReflectionData items as long as a corresponding KSQL data type exists. I worked around this by creating a new stream in KSQL that selects only the items I need which are also KSQL-compatible. Once that was done, I was able to process what I needed from the larger stream.
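A rough sketch of that approach follows. Every column and stream name here is made up for illustration, and whether your KSQL version accepts a hand-declared subset of an Avro schema's columns may vary, so treat this as a pattern rather than a verified recipe.
-- Declare only the KSQL-compatible fields you actually need (hypothetical names)
CREATE STREAM RAW_CLICKS (ADVERTISER_ID BIGINT, CLICKED_AT VARCHAR) \
WITH (VALUE_FORMAT='AVRO', \
KAFKA_TOPIC='COM_FINDOLOGY_MODEL_TRAFFIC_ENTRYPOINTCLICK');
-- Derive a narrower stream containing just those items
CREATE STREAM CLICKS_SLIM AS \
SELECT ADVERTISER_ID, CLICKED_AT \
FROM RAW_CLICKS;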
Comment: I think it is a bit of a shortcoming in KSQL that you have to create new, actual intermediate topics in Kafka in order to process the data. I think a better solution would be to treat the intermediate stream as a view onto the actual stream. As I understand it, the intermediary topic is needed to hold the accumulations and processed items backing what I take to be a KTable.
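For reference, this is the behaviour the comment is describing: a persistent query like the hypothetical one below (building on the earlier sketch) materializes its result into a real backing Kafka topic, plus internal changelog/repartition topics for the aggregation, all of which then appear in SHOW TOPICS;.
-- Hypothetical aggregation; the resulting table is backed by an actual Kafka topic
CREATE TABLE CLICKS_PER_ADVERTISER AS \
SELECT ADVERTISER_ID, COUNT(*) AS CLICK_COUNT \
FROM CLICKS_SLIM \
GROUP BY ADVERTISER_ID;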