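I am upgrading Flume from 1.7 to 1.9. My Flume conf has 5 Kafka sources and 7 AWS S3 sinks.
First I need to stop Flume 1.7, so I ran the command
kill $(ps -ef | grep flume | grep bidinfo | awk '{print $2}')
to stop the bidinfo job, but the process is still there. 22 hours later it still has not exited. Short of running 'kill -9 xxxx', what else can I do? Any suggestions are welcome!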
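This server had been running Flume 1.7 and Kafka for 60 days. I also tried the command kill -3 xxxx, but only two sources and two sinks stopped; the others kept running. I read the source code of Flume's KafkaSource and looked at the Flume logs. It may be that two calls (consumer.wakeup(); consumer.close();) never completed.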
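For reference, here is a rough sketch of a stop procedure that avoids 'kill -9': send SIGTERM, wait a bounded time, and if the JVM is still alive, capture a thread dump to see which threads are blocking shutdown. The function names wait_for_exit and stop_agent, the pattern 'flume.*bidinfo', and the output file name are made up for illustration. It assumes pgrep and the JDK's jstack are on the PATH and that it runs as the same user as the agent. As far as I know, on a HotSpot JVM 'kill -3' (SIGQUIT) normally only prints a thread dump to the process's stdout and does not by itself request a shutdown.

```shell
#!/usr/bin/env bash

# wait_for_exit PID TIMEOUT_SECONDS
# Returns 0 once PID no longer exists, 1 if it is still alive after the timeout.
wait_for_exit() {
  local pid=$1 timeout=$2 waited=0
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

# stop_agent PATTERN TIMEOUT_SECONDS
# Sends SIGTERM to every process whose command line matches PATTERN.
# If a process outlives the timeout, saves a jstack thread dump for inspection
# instead of escalating to SIGKILL.
stop_agent() {
  local pattern=$1 timeout=$2 pid
  for pid in $(pgrep -f "$pattern"); do
    kill -TERM "$pid"
    if ! wait_for_exit "$pid" "$timeout"; then
      echo "PID $pid still alive after ${timeout}s; saving thread dump" >&2
      jstack "$pid" > "flume-threads-$pid.txt" 2>&1 || true
    fi
  done
}
```

It could be called as stop_agent 'flume.*bidinfo' 120. In the resulting dump, the agent-shutdown-hook thread and any PollableSourceRunner threads would be the first places to look.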
#flume-conf
ag.sources = src_1 src_2 src_3 src_4 src_5
ag.channels = ch
ag.sinks = sk_1 sk_2 sk_3 sk_4 sk_5 sk_6 sk_7
#source:kafka
ag.sources.src_1.type = org.apache.flume.source.kafka.KafkaSource
ag.sources.src_1.kafka.bootstrap.servers = xxxx
ag.sources.src_1.kafka.consumer.group.id = flume.xxxxx
ag.sources.src_1.kafka.consumer.retry.backoff.ms = 10000
ag.sources.src_1.batchSize = 5000
ag.sources.src_1.batchDurationMillis = 2000
ag.sources.src_1.kafka.topics = xxxx
ag.sources.src_1.interceptors = i1
ag.sources.src_1.interceptors.i1.type = xxxx.interceptor
ag.sources.src_1.channels = ch
#sink:aws S3
ag.sinks.sk_1.type = hdfs
ag.sinks.sk_1.hdfs.path = s3a://xxxxxxx
ag.sinks.sk_1.hdfs.filePrefix = %{minute}
ag.sinks.sk_1.hdfs.fileSuffix = .xxx.1.lzo
ag.sinks.sk_1.hdfs.rollSize = 0
ag.sinks.sk_1.hdfs.rollCount = 0
ag.sinks.sk_1.hdfs.rollInterval = 0
ag.sinks.sk_1.hdfs.idleTimeout = 180
ag.sinks.sk_1.hdfs.callTimeout = 600000
ag.sinks.sk_1.hdfs.closeTries = 5
ag.sinks.sk_1.hdfs.retryInterval = 60
ag.sinks.sk_1.hdfs.batchSize = 3000
ag.sinks.sk_1.hdfs.codeC = lzop
ag.sinks.sk_1.hdfs.fileType = CompressedStream
ag.sinks.sk_1.hdfs.writeFormat = Text
ag.sinks.sk_1.channel = ch
#channels
ag.channels.ch.type = memory
ag.channels.ch.capacity = 2000000
ag.channels.ch.transactionCapacity = 100000
09 Sep 2019 11:14:18,010 ERROR [PollableSourceRunner-KafkaSource-src_2] (org.apache.flume.source.kafka.KafkaSource.doProcess:314) - KafkaSource EXCEPTION, {}
org.apache.flume.ChannelException: java.lang.InterruptedException
	at org.apache.flume.channel.BasicTransactionSemantics.commit(BasicTransactionSemantics.java:154)
	at org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:194)
	at org.apache.flume.source.kafka.KafkaSource.doProcess(KafkaSource.java:295)
	at org.apache.flume.source.AbstractPollableSource.process(AbstractPollableSource.java:60)
	at org.apache.flume.source.PollableSourceRunner$PollingRunner.run(PollableSourceRunner.java:133)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.InterruptedException
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
	at java.util.concurrent.Semaphore.tryAcquire(Semaphore.java:582)
	at org.apache.flume.channel.MemoryChannel$MemoryTransaction.doCommit(MemoryChannel.java:119)
	at org.apache.flume.channel.BasicTransactionSemantics.commit(BasicTransactionSemantics.java:151)
	... 5 more
# I think this is where the main problem is
09 Sep 2019 11:14:18,011 INFO [PollableSourceRunner-KafkaSource-src_2] (org.apache.flume.source.PollableSourceRunner$PollingRunner.run:143) - Source runner interrupted. Exiting
09 Sep 2019 11:14:18,011 INFO [agent-shutdown-hook] (org.apache.kafka.clients.consumer.internals.AbstractCoordinator$2.onFailure:571) - LeaveGroup request failed with error
org.apache.kafka.clients.consumer.internals.SendFailedException
#-------------
09 Sep 2019 11:14:18,018 INFO [agent-shutdown-hook] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:149) - Component type: SOURCE, name: src_2 stopped
09 Sep 2019 11:14:18,018 INFO [agent-shutdown-hook] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:155) - Shutdown Metric for type: SOURCE, name: src_2. source.start.time == 1567911973591
09 Sep 2019 11:14:18,018 INFO [agent-shutdown-hook] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:161) - Shutdown Metric for type: SOURCE, name: src_2. source.stop.time == 1567998858018
09 Sep 2019 11:14:18,018 INFO [agent-shutdown-hook] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:177) - Shutdown Metric for type: SOURCE, name: src_2. source.kafka.commit.time == 2470886
09 Sep 2019 11:14:18,018 INFO [agent-shutdown-hook] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:177) - Shutdown Metric for type: SOURCE, name: src_2. source.kafka.empty.count == 0
09 Sep 2019 11:14:18,018 INFO [agent-shutdown-hook] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:177) - Shutdown Metric for type: SOURCE, name: src_2. source.kafka.event.get.time == 78823378
09 Sep 2019 11:14:18,018 INFO [agent-shutdown-hook] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:177) - Shutdown Metric for type: SOURCE, name: src_2. src.append-batch.accepted == 0
09 Sep 2019 11:14:18,018 INFO [agent-shutdown-hook] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:177) - Shutdown Metric for type: SOURCE, name: src_2. src.append-batch.received == 0
09 Sep 2019 11:14:18,018 INFO [agent-shutdown-hook] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:177) - Shutdown Metric for type: SOURCE, name: src_2. src.append.accepted == 0
09 Sep 2019 11:14:18,018 INFO [agent-shutdown-hook] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:177) - Shutdown Metric for type: SOURCE, name: src_2. src.append.received == 0
09 Sep 2019 11:14:18,018 INFO [agent-shutdown-hook] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:177) - Shutdown Metric for type: SOURCE, name: src_2. src.events.accepted == 767319160
09 Sep 2019 11:14:18,018 INFO [agent-shutdown-hook] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:177) - Shutdown Metric for type: SOURCE, name: src_2. src.events.received == 767366357
09 Sep 2019 11:14:18,018 INFO [agent-shutdown-hook] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:177) - Shutdown Metric for type: SOURCE, name: src_2. src.open-connection.count == 0
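
To see which components never reached the "stopped" line, the log can be compared against the conf. This is only a sketch: the function names are made up, it assumes the agent is named ag as in the conf above, and it relies on the "Component type: ..., name: ... stopped" message format shown in this log.

```shell
#!/usr/bin/env bash

# list_stopped: read a Flume log on stdin and print "TYPE name"
# for every component that logged a "stopped" message.
list_stopped() {
  sed -n 's/.*Component type: \([A-Z]*\), name: \([^ ]*\) stopped.*/\1 \2/p' | sort -u
}

# configured_components CONF_FILE: print "SOURCE name" / "SINK name"
# for every source and sink declared for the agent "ag".
configured_components() {
  awk -F' *= *' '
    $1 == "ag.sources" { n = split($2, a, " "); for (i = 1; i <= n; i++) print "SOURCE", a[i] }
    $1 == "ag.sinks"   { n = split($2, a, " "); for (i = 1; i <= n; i++) print "SINK", a[i] }
  ' "$1" | sort -u
}

# still_running CONF_FILE LOG_FILE: print configured sources and sinks
# that have no "stopped" message in the log.
still_running() {
  local stopped
  stopped=$(list_stopped < "$2")
  configured_components "$1" | grep -vxF -e "$stopped"
}
```

For example, still_running flume-conf.properties flume.log (both file names are placeholders) would print one line per source or sink that has not logged a stop yet.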