With the following configuration, I can connect Samza to a Kafka broker:
systems.kafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory
systems.kafka.samza.msg.serde=json
systems.kafka.consumer.zookeeper.connect=localhost:2181/
systems.kafka.producer.bootstrap.servers=localhost:9092
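However, I have some doubts about the SystemFactory class. How do we write our own SystemFactory class, and what is the purpose of a SystemFactory class? Please give me some ideas.
Answer 0: (score: 3)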
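You can write your own system factory class by implementing the SystemFactory interface and its three abstract methods: getConsumer, getProducer, and getAdmin. In each method, taking getConsumer as an example, you create a system consumer, that is, an instance of another custom class that implements SystemConsumer and defines how messages are consumed from the system. By doing so, your Samza job knows how to obtain the system's admin/consumer/producer when it needs them.
Example (in Scala):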
import org.apache.samza.config.Config
import org.apache.samza.metrics.MetricsRegistry
import org.apache.samza.system.{SystemAdmin, SystemConsumer, SystemFactory, SystemProducer}

// YourSystemConsumer, YourSystemProducer and YourSystemAdmin are your own classes.
class YourSystemFactory extends SystemFactory {
  override def getConsumer(systemName: String, config: Config, registry: MetricsRegistry): SystemConsumer = {
    new YourSystemConsumer(
      getAdmin(systemName, config).asInstanceOf[YourSystemAdmin],
      config.get("someParam"))
  }
  override def getAdmin(systemName: String, config: Config): SystemAdmin = {
    new YourSystemAdmin(
      config.get("someParam"),
      config.get("someOtherParam"))
  }
  override def getProducer(systemName: String, config: Config, registry: MetricsRegistry): SystemProducer = {
    new YourSystemProducer(
      getAdmin(systemName, config).asInstanceOf[YourSystemAdmin],
      config.get("someParam"))
  }
}
In your configuration:
# Your system params
systems.your.samza.factory=your.package.YourSystemFactory
systems.your.consumer.param=value
systems.your.producer.param=value
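To make the purpose of the factory more concrete, here is a minimal, self-contained Java sketch of how a framework can turn a `systems.<name>.samza.factory` entry into a live object through reflection. It is not Samza's actual code: `DemoConsumer`, `DemoSystemFactory`, `DemoFactory` and `FactoryLookupSketch` are hypothetical stand-ins, and a plain `Map` takes the place of Samza's `Config`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for SystemConsumer, reduced to a single method.
interface DemoConsumer {
    String describe();
}

// Hypothetical stand-in for SystemFactory, reduced to getConsumer.
interface DemoSystemFactory {
    DemoConsumer getConsumer(String systemName, Map<String, String> config);
}

// Plays the role of YourSystemFactory: builds a consumer from config values.
class DemoFactory implements DemoSystemFactory {
    @Override
    public DemoConsumer getConsumer(String systemName, Map<String, String> config) {
        String param = config.get("systems." + systemName + ".consumer.param");
        return () -> "consumer for " + systemName + " with param=" + param;
    }
}

class FactoryLookupSketch {
    // Looks up systems.<name>.samza.factory and instantiates that class by name.
    static DemoSystemFactory loadFactory(String systemName, Map<String, String> config) {
        String key = "systems." + systemName + ".samza.factory";
        String className = config.get(key);
        if (className == null) {
            throw new IllegalArgumentException("Missing config key: " + key);
        }
        try {
            // The factory needs a no-argument constructor for this to work.
            return (DemoSystemFactory) Class.forName(className)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create factory " + className, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        config.put("systems.your.samza.factory", DemoFactory.class.getName());
        config.put("systems.your.consumer.param", "value");

        DemoSystemFactory factory = loadFactory("your", config);
        System.out.println(factory.getConsumer("your", config).describe());
    }
}
```

The point of the indirection is that the job only ever names the factory class in configuration; the framework never needs compile-time knowledge of your consumer, producer or admin classes.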
Answer 1: (score: 0)
You don't need to implement KafkaSystemFactory yourself. You just implement StreamTask.
Example:
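import org.apache.samza.system.IncomingMessageEnvelope;
import org.apache.samza.task.MessageCollector;
import org.apache.samza.task.StreamTask;
import org.apache.samza.task.TaskCoordinator;

public class MyTaskClass implements StreamTask {
    public void process(IncomingMessageEnvelope envelope, MessageCollector collector, TaskCoordinator coordinator) {
        // process message
    }
}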
Configuration:
# This is the class above, which Samza will instantiate when the job is run
task.class=com.example.samza.MyTaskClass
# Define a system called "kafka" (you can give it any name, and you can define
# multiple systems if you want to process messages from different sources)
systems.kafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory
# The job consumes a topic called "PageViewEvent" from the "kafka" system
task.inputs=kafka.PageViewEvent
# Define a serializer/deserializer called "json" which parses JSON messages
serializers.registry.json.class=org.apache.samza.serializers.JsonSerdeFactory
# Use the "json" serializer for messages in the "PageViewEvent" topic
systems.kafka.streams.PageViewEvent.samza.msg.serde=json
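The `task.inputs` value above ties the two names together: the part before the first dot is the system name ("kafka"), and the rest is the stream name ("PageViewEvent"). The following Java sketch illustrates that convention. It assumes the split happens on the first dot only, which is my understanding of Samza's behavior; `TaskInputsSketch` and its methods are hypothetical helpers, not part of the Samza API.

```java
import java.util.ArrayList;
import java.util.List;

class TaskInputsSketch {
    // Splits "system.stream" on the first dot only and returns {system, stream}.
    static String[] parseInput(String input) {
        int idx = input.indexOf('.');
        if (idx <= 0 || idx == input.length() - 1) {
            throw new IllegalArgumentException("Expected <system>.<stream>, got: " + input);
        }
        return new String[] { input.substring(0, idx), input.substring(idx + 1) };
    }

    // Parses a comma-separated task.inputs value into {system, stream} pairs.
    static List<String[]> parseInputs(String taskInputs) {
        List<String[]> result = new ArrayList<>();
        for (String entry : taskInputs.split(",")) {
            String trimmed = entry.trim();
            if (!trimmed.isEmpty()) {
                result.add(parseInput(trimmed));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (String[] pair : parseInputs("kafka.PageViewEvent")) {
            System.out.println("system=" + pair[0] + ", stream=" + pair[1]);
        }
    }
}
```

Under this assumption, the system name used in `task.inputs` has to match the name in the `systems.<name>.samza.factory` key, since that is how the job finds the factory responsible for the stream.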
For more information, see: Documentation