Cannot connect to external topics in KSql

Date: 2018-08-06 23:09:44

Tags: apache-kafka confluent ksql

I'm very new to Confluent KSql, but not new to Kafka. I have existing topics in Kafka whose data is serialized as Avro. I have the Confluent Schema Registry up and running, and I have configured KSql to point at the registry.

When I try to create a table based on one of these topics, KSql complains that it cannot find the stream. When I try to create a stream in KSql that simply streams my topic, it does not seem able to resolve the Avro-serialized topic whose schema is registered in the registry.

Does anyone know how to get past these two problems? Or is the way I want to use KSql simply not a good fit for what it can do?

UPDATE

More details

ksql> show topics;

 Kafka Topic                                                                                 | Registered | Partitions | Partition Replicas | Consumers | Consumer Groups
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 COM_FINDOLOGY_MODEL_REPORTING_OUTGOINGFEEDADVERTISERSEARCHDATA                              | false      | 2          | 2                  | 0         | 0
 COM_FINDOLOGY_MODEL_TRAFFIC_CPATRACKINGCALLBACK                                             | false      | 2          | 2                  | 0         | 0
 COM_FINDOLOGY_MODEL_TRAFFIC_ENTRYPOINTCLICK                                                 | true       | 10         | 3                  | 0         | 0

KSql configuration

#bootstrap.servers=localhost:9092
bootstrap.servers=host1:9092,host2:9092,host3:9092,host4:9092,host5:9092

#listeners=http://localhost:8088
listeners=http://localhost:59093

ksql.server.ui.enabled=true

ksql.schema.registry.url=http://host1:59092

Schema Registry configuration

# The host name advertised in ZooKeeper. Make sure to set this if running Schema Registry with multiple nodes.
host.name: x.x.x.x
listeners=http://0.0.0.0:59092

# Zookeeper connection string for the Zookeeper cluster used by your Kafka cluster
# (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
#kafkastore.connection.url=localhost:2181

# Alternatively, Schema Registry can now operate without Zookeeper, handling all coordination via
# Kafka brokers. Use this setting to specify the bootstrap servers for your Kafka cluster and it
# will be used both for selecting the master schema registry instance and for storing the data for
# registered schemas.
# (Note that you cannot mix the two modes; use this mode only on new deployments or by shutting down
# all instances, switching to the new configuration, and then starting the schema registry
# instances again.)
kafkastore.bootstrap.servers=PLAINTEXT://host1:9092,PLAINTEXT://host2:9092,PLAINTEXT://host3:9092,PLAINTEXT://host4:9092,PLAINTEXT://host5:9092

# The name of the topic to store schemas in
kafkastore.topic=_schemas

# If true, API requests that fail will include extra debugging information, including stack traces
debug=false

Attempt to work around the problem by registering the external topic

ksql> register  topic xxx with (value_format='avro', kafka_topic='COM_FINDOLOGY_MODEL_REPORTING_OUTGOINGFEEDADVERTISERSEARCHDATA');
You need to provide avro schema file path for topics in avro format.

2 Answers:

Answer 0 (score: 0)

REGISTER TOPIC is deprecated syntax. You should use CREATE STREAM (or CREATE TABLE, depending on your data access requirements) instead.

So your statement should look something like this:

CREATE STREAM MY_STREAM_1 \
  WITH (VALUE_FORMAT='AVRO', \
  KAFKA_TOPIC='COM_FINDOLOGY_MODEL_REPORTING_OUTGOINGFEEDADVERTISERSEARCHDATA');

Note that I've broken the line with \ for readability; you don't have to do that.
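With VALUE_FORMAT='AVRO' and no explicit column list, KSql should pull the value schema from the Schema Registry configured via ksql.schema.registry.url, which is exactly what the question is set up to do. As a minimal usage sketch (MY_STREAM_1 is just the placeholder name from the statement above), you can then inspect the inferred columns and query the data from the CLI; setting auto.offset.reset to earliest makes the query read the topic from the beginning rather than only new records:

DESCRIBE MY_STREAM_1;

SET 'auto.offset.reset'='earliest';
SELECT * FROM MY_STREAM_1;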

Answer 1 (score: 0)

I resolved the problem I was having by changing what I consume from the Kafka topic, rather than consuming the entire topic contents. The topic contains Avro-encoded data produced via reflection (ReflectionData), which is fine in itself. KSql has trouble with the non-standard items in the stream, but it can handle the reflection-based items as long as a corresponding KSql data type exists. I worked around this by creating a new stream in KSql that selects only the items I need that are also KSql-compatible. Once that was in place, I could process what I needed from the larger stream.
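As an illustrative sketch of that workaround (not from the original answer), using one of the topics listed earlier; the column names CLICK_ID, ADVERTISER_ID and CLICK_TS are hypothetical stand-ins for whichever Avro fields map cleanly onto KSql types:

CREATE STREAM ENTRYPOINTCLICK_RAW \
  WITH (VALUE_FORMAT='AVRO', \
  KAFKA_TOPIC='COM_FINDOLOGY_MODEL_TRAFFIC_ENTRYPOINTCLICK');

CREATE STREAM ENTRYPOINTCLICK_CLEAN AS \
  SELECT CLICK_ID, ADVERTISER_ID, CLICK_TS \
  FROM ENTRYPOINTCLICK_RAW;

The second statement (CREATE STREAM ... AS SELECT) writes its results to a new Kafka topic, which is the intermediate topic the comment below objects to.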

Comment: I think this is a bit of a shortcoming in KSql, in that you have to create a new actual intermediate topic in Kafka just to process the data. A better solution, in my view, would be to treat the intermediate stream as a view onto the actual stream. Intermediate topics should only be needed to hold aggregates and processed items, ahead of what I understand to be a KTable.
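To make the KTable remark concrete, here is an assumed sketch, building on the hypothetical ENTRYPOINTCLICK_CLEAN stream above, of the kind of stateful aggregation that genuinely needs a materialized, topic-backed result (a TABLE in KSql terms), as opposed to the simple projection earlier:

CREATE TABLE CLICKS_PER_ADVERTISER AS \
  SELECT ADVERTISER_ID, COUNT(*) AS CLICK_COUNT \
  FROM ENTRYPOINTCLICK_CLEAN \
  GROUP BY ADVERTISER_ID;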