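I use Flume (with a file channel) to collect logs and sink them to MQ.
Sometimes, because of a connection timeout exception between Flume and MQ, Flume restarts automatically, as shown in the log snippet below.
For brevity, I have removed duplicate or similar log lines. Is this normal?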
24 十月 2018 20:40:23,192 INFO [lifecycleSupervisor-1-0] (org.apache.flume.node.PollingPropertiesFileConfigurationProvider.start:62) - Configuration provider starting
24 十月 2018 20:40:23,200 INFO [conf-file-poller-0] (org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run:134) - Reloading configuration file:/flume/conf/odps.conf
24 十月 2018 20:40:23,210 INFO [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) - Processing:k1
24 十月 2018 20:40:23,210 INFO [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) - Processing:k2
24 十月 2018 20:40:23,210 INFO [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) - Processing:k3
24 十月 2018 20:40:23,310 INFO [conf-file-poller-0] (org.apache.flume.channel.DefaultChannelFactory.create:42) - Creating instance of channel c1 type file
24 十月 2018 20:40:23,310 INFO [conf-file-poller-0] (org.apache.flume.node.AbstractConfigurationProvider.loadChannels:201) - Created channel c1
24 十月 2018 20:40:23,311 INFO [conf-file-poller-0] (org.apache.flume.source.DefaultSourceFactory.create:41) - Creating instance of source r1, type exec
24 十月 2018 20:40:23,339 INFO [conf-file-poller-0] (org.apache.flume.sink.DefaultSinkFactory.create:42) - Creating instance of sink: k1, type: com.aliyun.datahub.flume.sink.DatahubSink
24 十月 2018 20:40:23,350 INFO [conf-file-poller-0] (org.apache.flume.node.AbstractConfigurationProvider.getConfiguration:116) - Channel c1 connected to [r1, k1]
24 十月 2018 20:40:23,373 INFO [lifecycleSupervisor-1-3] (org.apache.flume.channel.file.Log.<init>:344) - Encryption is not enabled
24 十月 2018 20:40:23,375 INFO [lifecycleSupervisor-1-3] (org.apache.flume.channel.file.Log.replay:393) - Replay started
24 十月 2018 20:40:23,384 INFO [lifecycleSupervisor-1-6] (org.apache.flume.channel.file.Log.replay:405) - Found NextFileID 9, from [/flume/data/c7/log-8, /flume/data/c1/log-9]
24 十月 2018 20:40:23,391 INFO [lifecycleSupervisor-1-2] (org.apache.flume.channel.file.EventQueueBackingStoreFileV3.<init>:53) - Starting up with /flume/checkpoint/c1/checkpoint and /flume/checkpoint/c11/checkpoint.meta
24 十月 2018 20:40:23,393 INFO [lifecycleSupervisor-1-9] (org.apache.flume.channel.file.EventQueueBackingStoreFileV3.<init>:57) - Reading checkpoint metadata from /flume/checkpoint/c9/checkpoint.meta
24 十月 2018 20:40:23,540 INFO [lifecycleSupervisor-1-7] (org.apache.flume.channel.file.FlumeEventQueue.<init>:115) - QueueSet population inserting 0 took 0
24 十月 2018 20:40:23,547 INFO [lifecycleSupervisor-1-3] (org.apache.flume.channel.file.Log.replay:444) - Last Checkpoint Wed Oct 24 20:35:12 CST 2018, queue depth = 0
24 十月 2018 20:40:23,599 INFO [lifecycleSupervisor-1-8] (org.apache.flume.channel.file.Log.doReplay:529) - Replaying logs with v2 replay logic
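However, I found "Removing old file" messages in the Flume log. The "old files" are in the channel's data directory, so it looks as if events in the file channel were lost. That does not seem reasonable to me.
So, are there any "fault tolerance" configuration options?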
24 十月 2018 20:42:52,657 INFO [Log-BackgroundWorker-c3] (org.apache.flume.channel.file.LogFile$RandomReader.close:520) - Closing RandomReader /flume/data/c3/log-13
24 十月 2018 20:42:52,659 INFO [Log-BackgroundWorker-c11] (org.apache.flume.channel.file.Log.writeCheckpoint:1052) - Updated checkpoint for file: /flume/data/c11/log-14 position: 4864 logWriteOrderID: 1540387389295
24 十月 2018 20:42:52,659 INFO [Log-BackgroundWorker-c11] (org.apache.flume.channel.file.Log.removeOldLogs:1108) - Removing old file: /flume/data/c11/log-10
24 十月 2018 20:42:52,659 INFO [Log-BackgroundWorker-c11] (org.apache.flume.channel.file.Log.removeOldLogs:1108) - Removing old file: /flume/data/c11/log-10.meta
24 十月 2018 20:42:52,660 INFO [Log-BackgroundWorker-c11] (org.apache.flume.channel.file.Log.removeOldLogs:1108) - Removing old file: /flume/data/c11/log-11
24 十月 2018 20:42:52,660 INFO [Log-BackgroundWorker-c7] (org.apache.flume.channel.file.EventQueueBackingStoreFile.checkpoint:252) - Updating checkpoint metadata: logWriteOrderID: 1540387389297, queueSize: 161908, queueHead: 71092
24 十月 2018 20:42:52,660 INFO [Log-BackgroundWorker-c11] (org.apache.flume.channel.file.Log.removeOldLogs:1108) - Removing old file: /flume/data/c11/log-11.meta
24 十月 2018 20:42:52,660 INFO [Log-BackgroundWorker-c11] (org.apache.flume.channel.file.Log.removeOldLogs:1108) - Removing old file: /flume/data/c11/log-12
24 十月 2018 20:42:52,660 INFO [Log-BackgroundWorker-c11] (org.apache.flume.channel.file.Log.removeOldLogs:1108) - Removing old file: /flume/data/c11/log-12.meta
24 十月 2018 20:42:52,665 INFO [Log-BackgroundWorker-c8] (org.apache.flume.channel.file.Log.writeCheckpoint:1052) - Updated checkpoint for file: /flume/data/c8/log-14 position: 3003 logWriteOrderID: 1540387389296
24 十月 2018 20:42:52,666 INFO [Log-BackgroundWorker-c8] (org.apache.flume.channel.file.LogFile$RandomReader.close:520) - Closing RandomReader /flume/data/c8/log-11
24 十月 2018 20:42:52,671 INFO [Log-BackgroundWorker-c8] (org.apache.flume.channel.file.LogFile$RandomReader.close:520) - Closing RandomReader /flume/data/c8/log-12
24 十月 2018 20:42:52,672 INFO [Log-BackgroundWorker-c7] (org.apache.flume.channel.file.Log.writeCheckpoint:1052) - Updated checkpoint for file: /flume/data/c7/log-15 position: 4658633 logWriteOrderID: 1540387389297
24 十月 2018 20:42:52,672 INFO [Log-BackgroundWorker-c7] (org.apache.flume.channel.file.LogFile$RandomReader.close:520) - Closing RandomReader /flume/data/c7/log-11
24 十月 2018 20:42:52,679 INFO [Log-BackgroundWorker-c7] (org.apache.flume.channel.file.LogFile$RandomReader.close:520) - Closing RandomReader /flume/data/c7/log-12
24 十月 2018 20:42:52,689 INFO [Log-BackgroundWorker-c7] (org.apache.flume.channel.file.LogFile$RandomReader.close:520) - Closing RandomReader /flume/data/c7/log-13
24 十月 2018 20:42:52,698 INFO [Log-BackgroundWorker-c7] (org.apache.flume.channel.file.LogFile$RandomReader.close:520) - Closing RandomReader /flume/data/c7/log-14
24 十月 2018 20:42:53,362 INFO [Log-BackgroundWorker-c2] (org.apache.flume.channel.file.EventQueueBackingStoreFile.beginCheckpoint:227) - Start checkpoint for /flume/checkpoint/c2/checkpoint, elements to sync = 16
24 十月 2018 20:42:53,370 INFO [Log-BackgroundWorker-c2] (org.apache.flume.channel.file.EventQueueBackingStoreFile.checkpoint:252) - Updating checkpoint metadata: logWriteOrderID: 1540387389766, queueSize: 70, queueHead: 4238
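
To see at a glance which data files were removed from each data directory, the "Removing old file" lines above could be summarized with a small script. This is only a minimal sketch: the function name `removed_files_by_dir` and the regular expression are my own assumptions based on the log format shown in this post, not anything provided by Flume.

```python
import re
from collections import defaultdict

# Matches the message part of lines such as:
#   ... - Removing old file: /flume/data/c11/log-10
# The pattern is inferred from the log excerpt in this post (an assumption).
REMOVED = re.compile(r"Removing old file: (\S+)/(log-\d+)(\.meta)?$")


def removed_files_by_dir(lines):
    """Group removed data files by their directory, skipping .meta entries."""
    removed = defaultdict(list)
    for line in lines:
        match = REMOVED.search(line.strip())
        if match and not match.group(3):  # group(3) is ".meta" when present
            removed[match.group(1)].append(match.group(2))
    return dict(removed)


if __name__ == "__main__":
    import sys

    # Usage: python removed_files.py < flume.log
    for directory, files in removed_files_by_dir(sys.stdin).items():
        print(directory, ", ".join(files))
```

Comparing this summary with the "Updated checkpoint for file" lines for the same directory shows whether the removed files are all older than the file the checkpoint currently points to.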