我们正在努力解决由Flume管理的从Kafka到HDFS的数据流。 由于下面描述的例外情况,数据未完全传输到hdfs。 但是,这个错误对我们来说有误导性,我们在数据目录和hdfs中都有足够的空间。我们认为它可能是通道配置的问题,但我们对其他来源有类似的配置,它可以正常工作。如果有人不得不处理这个问题,我会很感激提示。
17 Aug 2017 14:15:24,335 ERROR [Log-BackgroundWorker-channel1] (org.apache.flume.channel.file.Log$BackgroundWorker.run:1204) - Error doing checkpoint
java.io.IOException: Usable space exhausted, only 0 bytes remaining, required 524288000 bytes
at org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:1003)
at org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:986)
at org.apache.flume.channel.file.Log.access$200(Log.java:75)
at org.apache.flume.channel.file.Log$BackgroundWorker.run(Log.java:1201)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
17 Aug 2017 14:15:27,552 ERROR [PollableSourceRunner-KafkaSource-kafkaSource] (org.apache.flume.source.kafka.KafkaSource.doProcess:305) - KafkaSource EXCEPTION, {}
org.apache.flume.ChannelException: Commit failed due to IO error [channel=channel1]
at org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doRollback(FileChannel.java:639)
at org.apache.flume.channel.BasicTransactionSemantics.rollback(BasicTransactionSemantics.java:168)
at org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:194)
at org.apache.flume.source.kafka.KafkaSource.doProcess(KafkaSource.java:286)
at org.apache.flume.source.AbstractPollableSource.process(AbstractPollableSource.java:58)
at org.apache.flume.source.PollableSourceRunner$PollingRunner.run(PollableSourceRunner.java:137)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Usable space exhausted, only 0 bytes remaining, required 524288026 bytes
at org.apache.flume.channel.file.Log.rollback(Log.java:722)
at org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doRollback(FileChannel.java:637)
... 6 more
Flume配置:
agent2.sources = kafkaSource
#sources defined
agent2.sources.kafkaSource.type = org.apache.flume.source.kafka.KafkaSource
agent2.sources.kafkaSource.kafka.bootstrap.servers = …
agent2.sources.kafkaSource.kafka.topics = pega-campaign-response
agent2.sources.kafkaSource.channels = channel1
# channels defined
agent2.channels = channel1
agent2.channels.channel1.type = file
agent2.channels.channel1.checkpointDir = /data/cloudera/.flume/filechannel/checkpointdirs/pega
agent2.channels.channel1.dataDirs = /data/cloudera/.flume/filechannel/datadirs/pega
agent2.channels.channel1.capacity = 10000
agent2.channels.channel1.transactionCapacity = 10000
#hdfs sinks
agent2.sinks = sink
agent2.sinks.sink.type = hdfs
agent2.sinks.sink.hdfs.fileType = DataStream
agent2.sinks.sink.hdfs.path = hdfs://bigdata-cls:8020/stage/data/pega/campaign-response/%d%m%Y
agent2.sinks.sink.hdfs.batchSize = 1000
agent2.sinks.sink.hdfs.rollCount = 0
agent2.sinks.sink.hdfs.rollSize = 0
agent2.sinks.sink.hdfs.rollInterval = 120
agent2.sinks.sink.hdfs.useLocalTimeStamp = true
agent2.sinks.sink.hdfs.filePrefix = pega-
df -h命令:
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/rhel-root 26G 6.8G 18G 28% /
devtmpfs 126G 0 126G 0% /dev
tmpfs 126G 6.3M 126G 1% /dev/shm
tmpfs 126G 2.9G 123G 3% /run
tmpfs 126G 0 126G 0% /sys/fs/cgroup
/dev/sda1 477M 133M 315M 30% /boot
tmpfs 26G 0 26G 0% /run/user/0
cm_processes 126G 1.9G 124G 2% /run/cloudera-scm-agent/process
/dev/scinib 2.0T 53G 1.9T 3% /data
tmpfs 26G 20K 26G 1% /run/user/2000
答案 0 :(得分:1)
将通道类型更改为内存通道并对其进行测试以隔离磁盘空间问题。 agent2.channels.channel1.type = memory
此外,由于您的设置中已经有kafka,因此将其用作水槽通道。
答案 1 :(得分:0)
您的错误没有指向hdfs中的可用空间,它是您频道中使用的文件的本地磁盘中的自由空间。如果您在此处看到file-channel,您将看到默认值为524288000.检查可用的本地空间是否足够(根据您的错误,它似乎为0)。您还可以更改属性minimumRequiredSpace。