Why does an optional Flume channel cause problems for a non-optional Flume channel?

Time: 2015-06-25 17:03:29

Tags: java apache-kafka flume avro flume-ng

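I have a simple Flume configuration that is giving me a lot of trouble. I'll describe the problem first and then list the configuration files.

I have 3 servers: Server1, Server2, and Server3.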

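Server1:
Netcat source / Syslogtcp source (I tested this with netcat, with acks turned off, and with syslogtcp)
2 memory channels
2 Avro sinks (one per channel)
Replicating channel selector with the second memory channel marked optional

Server2 and Server3:
Avro source
Memory channel
Kafka sink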

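In my simulation, Server2 stands in for "production" and therefore cannot experience any data loss, while Server3 stands in for "development", where data loss is fine. My assumption was that using 2 channels and 2 sinks would decouple the two servers from each other, so that if Server3 went down it would not affect Server2 (especially with the optional configuration option!). However, that is not the case. When I run my simulation and kill Server3 with CTRL-C, Server2 slows down and the output from Server2 to the Kafka sink drops to a crawl. When I bring the Flume agent on Server3 back up, everything returns to normal.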

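I did not expect this behavior. What I expected was that, because I have two channels and two sinks, if one channel and/or sink goes down, the other channel and/or sink should have no problem. Is this a limitation of Flume? Is it a limitation of my sources, sinks, or channels? Is there a way to make Flume behave this way while using one agent with multiple channels and sinks that are decoupled from each other? I really don't want to set up multiple Flume agents on one machine, one per "environment" (production and development). Attached are my configuration files, so you can see what I did in more technical detail: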

SERVER1 (FIRST-TIER AGENT)

#Describe the top level configuration    
agent.sources = mySource
agent.channels = defaultChannel1 defaultChannel2
agent.sinks = mySink1 mySink2

#Describe/configure the source
agent.sources.mySource.type = netcat
agent.sources.mySource.port = 6666
agent.sources.mySource.bind = 0.0.0.0
agent.sources.mySource.max-line-length = 150000
agent.sources.mySource.ack-every-event = false
#agent.sources.mySource.type = syslogtcp
#agent.sources.mySource.host = 0.0.0.0
#agent.sources.mySource.port = 7103
#agent.sources.mySource.eventSize = 150000
agent.sources.mySource.channels = defaultChannel1 defaultChannel2
agent.sources.mySource.selector.type = replicating
agent.sources.mySource.selector.optional = defaultChannel2

#Describe/configure the channel
agent.channels.defaultChannel1.type = memory
agent.channels.defaultChannel1.capacity = 5000
agent.channels.defaultChannel1.transactionCapacity = 200

agent.channels.defaultChannel2.type = memory
agent.channels.defaultChannel2.capacity = 5000
agent.channels.defaultChannel2.transactionCapacity = 200

#Avro Sink
agent.sinks.mySink1.channel = defaultChannel1
agent.sinks.mySink1.type = avro
agent.sinks.mySink1.hostname = Server2
agent.sinks.mySink1.port = 6666

agent.sinks.mySink2.channel = defaultChannel2
agent.sinks.mySink2.type = avro
agent.sinks.mySink2.hostname = Server3
agent.sinks.mySink2.port = 6666

SERVER2 "PROD" FLUME AGENT

#Describe the top level configuration
agent.sources = mySource
agent.channels = defaultChannel
agent.sinks = mySink

#Describe/configure the source
agent.sources.mySource.type = avro
agent.sources.mySource.port = 6666
agent.sources.mySource.bind = 0.0.0.0
agent.sources.mySource.max-line-length = 150000
agent.sources.mySource.channels = defaultChannel

#Describe/configure the interceptor
agent.sources.mySource.interceptors = myInterceptor
agent.sources.mySource.interceptors.myInterceptor.type = myInterceptor$Builder

#Describe/configure the channel
agent.channels.defaultChannel.type = memory
agent.channels.defaultChannel.capacity = 5000
agent.channels.defaultChannel.transactionCapacity = 200

#Describe/configure the sink
agent.sinks.mySink.type = org.apache.flume.sink.kafka.KafkaSink
agent.sinks.mySink.topic = Server2-topic
agent.sinks.mySink.brokerList = broker1:9092, broker2:9092
agent.sinks.mySink.requiredAcks = -1
agent.sinks.mySink.batchSize = 100
agent.sinks.mySink.channel = defaultChannel

SERVER3 "DEV" FLUME AGENT

#Describe the top level configuration
agent.sources = mySource
agent.channels = defaultChannel
agent.sinks = mySink

#Describe/configure the source
agent.sources.mySource.type = avro
agent.sources.mySource.port = 6666
agent.sources.mySource.bind = 0.0.0.0
agent.sources.mySource.max-line-length = 150000
agent.sources.mySource.channels = defaultChannel

#Describe/configure the interceptor
agent.sources.mySource.interceptors = myInterceptor
agent.sources.mySource.interceptors.myInterceptor.type = myInterceptor$Builder

#Describe/configure the channel
agent.channels.defaultChannel.type = memory
agent.channels.defaultChannel.capacity = 5000
agent.channels.defaultChannel.transactionCapacity = 200

#Describe/configure the sink
agent.sinks.mySink.type = org.apache.flume.sink.kafka.KafkaSink
agent.sinks.mySink.topic = Server3-topic
agent.sinks.mySink.brokerList = broker1:9092, broker2:9092
agent.sinks.mySink.requiredAcks = -1
agent.sinks.mySink.batchSize = 100
agent.sinks.mySink.channel = defaultChannel 

Thanks for your help!

1 answer:

Answer 0 (score: 0)

I would consider tuning these configuration parameters, since they relate to the memory channel:

  

agent.channels.defaultChannel.capacity = 5000
agent.channels.defaultChannel.transactionCapacity = 200

Maybe try doubling them first and test again; you should see an improvement:

  

agent.channels.defaultChannel.capacity = 10000
agent.channels.defaultChannel.transactionCapacity = 400
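
For what it's worth, the agent that replicates to both servers is Server1, and its channels are named defaultChannel1 and defaultChannel2 rather than defaultChannel. Below is a minimal sketch of the doubled values applied to that agent. The last property is not part of the suggestion above: keep-alive is the memory channel's timeout, in seconds, for adding or removing an event (default 3). Lowering it on the optional channel is only an assumption worth testing, on the theory that the source waits out that timeout each time the optional channel is full. The value 1 is an arbitrary illustration.

```properties
# Server1: doubled memory channel sizes, using this agent's channel names
agent.channels.defaultChannel1.type = memory
agent.channels.defaultChannel1.capacity = 10000
agent.channels.defaultChannel1.transactionCapacity = 400

agent.channels.defaultChannel2.type = memory
agent.channels.defaultChannel2.capacity = 10000
agent.channels.defaultChannel2.transactionCapacity = 400

# Assumption, not from the answer above: shorten the add/remove timeout
# (seconds, default 3) on the optional channel so a full channel fails faster
agent.channels.defaultChannel2.keep-alive = 1
```

Keep in mind that a larger capacity only postpones the moment the optional channel fills up while Server3 is down, so it is probably worth letting the CTRL-C test run long enough to push more than 10000 events through.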

It is also a good idea to watch the JVM of the Apache Flume instance during testing.