How to gracefully stop a Flume agent

Time: 2019-05-30 04:36:04

Tags: hdfs flume

Many websites suggest using kill -9 to stop a Flume agent.

However, when I stop the agent with kill -9, the files the HDFS sink was writing stay open forever (left as *.tmp).

How can I stop a Flume agent gracefully, so that it closes all the files it is writing on HDFS before it stops?

#Name the components on this agent
agent.sources = r1
agent.sinks = k1
agent.channels = c1

#Configure the Kafka Source
agent.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
agent.sources.r1.batchSize = 1000
agent.sources.r1.batchDurationMillis = 3000
agent.sources.r1.kafka.bootstrap.servers = <server1>:6667,<server2>:6667
agent.sources.r1.kafka.topics = 1-agent1-thread
agent.sources.r1.kafka.consumer.group.id = flume_agent_thread

#Describe the sink
agent.sinks.k1.type = hdfs
agent.sinks.k1.hdfs.path = /user/flume/kafka-data/1-agent1-thread/%y%m%d/%H
agent.sinks.k1.hdfs.filePrefix = 1-agent1-thread

#Sink settings to avoid encoding problems
agent.sinks.k1.hdfs.fileType = DataStream
agent.sinks.k1.hdfs.writeFormat = Text

#Sink settings to avoid creating many small HDFS files
### Roll a file after a certain number of events ###
agent.sinks.k1.hdfs.rollInterval = 0
agent.sinks.k1.hdfs.rollSize = 0
agent.sinks.k1.hdfs.rollCount = 10000
agent.sinks.k1.hdfs.batchSize = 100
agent.sinks.k1.hdfs.idleTimeout = 300
agent.sinks.k1.hdfs.closeTries = 0
agent.sinks.k1.hdfs.retryInterval = 200

#Use a channel which buffers events in memory
agent.channels.c1.type = memory
agent.channels.c1.capacity = 10000
agent.channels.c1.transactionCapacity = 1000

#Bind the source and sink to the channel
agent.sources.r1.channels = c1
agent.sinks.k1.channel = c1

1 Answer:

Answer 0: (score: 0)

Using kill -TERM is the standard way to stop any Hadoop-related service.

Flume registers a shutdown hook (see Application.java), which runs when the JVM receives SIGTERM and should close all open files.

ShutdownHook() -> stop() -> stopAllComponents()

Use kill -9 only if you have already tried kill -TERM and the Flume agent is still hanging.
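
The escalation described above (SIGTERM first, SIGKILL only as a last resort) could be scripted roughly as follows. This is a minimal sketch, not something taken from Flume itself: the function name stop_gracefully, its arguments, and the 30-second default timeout are all hypothetical choices for illustration.

```shell
#!/bin/sh
# Sketch: ask a process to stop with SIGTERM and escalate to SIGKILL
# only if it is still alive after a timeout.
# stop_gracefully, its arguments and the default timeout are hypothetical.
stop_gracefully() {
    pid=$1
    timeout=${2:-30}   # seconds to wait before escalating (assumed default)

    # SIGTERM lets the JVM run its shutdown hooks.
    # If the signal cannot be delivered, treat the process as already gone.
    kill -TERM "$pid" 2>/dev/null || return 0

    waited=0
    # kill -0 sends no signal; it only checks that the process still exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "PID $pid still running after ${timeout}s, sending SIGKILL" >&2
            kill -KILL "$pid" 2>/dev/null
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

Assuming a single agent runs on the host, the PID could be looked up with something like pgrep -f org.apache.flume.node.Application and passed in as stop_gracefully "$pid" 60. A return value of 1 means the fallback to SIGKILL was needed, in which case *.tmp files may still be left behind on HDFS.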