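I am trying to stream some data into HDFS with Flume, using a single agent configured with a netcat source, a memory channel, and an HDFS sink.
The configuration is as follows: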
a1.sources = src1
a1.channels = ch1
a1.sinks = snk1
# SOURCES CONFIGURATION
a1.sources.src1.type = netcat
a1.sources.src1.bind = 0.0.0.0
a1.sources.src1.port = 99999
a1.sources.src1.ack-every-event = false
# SOURCE -> CHANNEL
a1.sources.src1.channels = ch1
# SINKS' CONFIGURATION
a1.sinks.snk1.type = hdfs
a1.sinks.snk1.hdfs.path = /somepath
a1.sinks.snk1.hdfs.writeFormat = Text
a1.sinks.snk1.hdfs.fileType = DataStream
a1.sinks.snk1.hdfs.inUseSuffix = .tmp
a1.sinks.snk1.hdfs.filePrefix = prefix_file
a1.sinks.snk1.hdfs.batchSize = 75000
a1.sinks.snk1.hdfs.rollInterval = 120
a1.sinks.snk1.hdfs.rollCount = 0
a1.sinks.snk1.hdfs.idleTimeout = 0
# 128 MB maximum per file = 128 * 1024 (KB per MB) * 1024 (bytes per KB) = ...
a1.sinks.snk1.hdfs.rollSize = 134217728
a1.sinks.snk1.hdfs.threadsPoolSize = 25
# SINK <- CHANNEL
a1.sinks.snk1.channel = ch1
# CHANNELS' CONFIGURATION
a1.channels.ch1.type = memory
a1.channels.ch1.capacity = 5000000
a1.channels.ch1.transactionCapacity = 100000
#412MB of byte capacity = 412 * 1024 * 1024 byte
#a1.channels.ch1.byteCapacity = 432013312
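A few of the numbers in this configuration constrain each other, so a quick sanity check can help before looking at the exception. The sketch below is only illustrative: `check_config` is a hypothetical helper (not part of Flume), and the three rules it applies are assumptions about Flume's behaviour, namely that a TCP listen port must lie in 1..65535, that the memory channel's transactionCapacity may not exceed its capacity, and that the sink's batchSize should not exceed the channel's transactionCapacity.

```python
# Values copied from the agent configuration above.
config = {
    "a1.sources.src1.port": 99999,
    "a1.sinks.snk1.hdfs.batchSize": 75000,
    "a1.channels.ch1.capacity": 5000000,
    "a1.channels.ch1.transactionCapacity": 100000,
}


def check_config(cfg):
    """Return a list of warnings (hypothetical helper, not a Flume API)."""
    problems = []

    # TCP port numbers are 16-bit, so a listen port must be in 1..65535.
    port = cfg["a1.sources.src1.port"]
    if not 1 <= port <= 65535:
        problems.append(f"port {port} is outside the valid TCP range 1..65535")

    batch = cfg["a1.sinks.snk1.hdfs.batchSize"]
    tx = cfg["a1.channels.ch1.transactionCapacity"]
    capacity = cfg["a1.channels.ch1.capacity"]

    # Assumption: one sink transaction takes up to batchSize events,
    # so batchSize should fit inside transactionCapacity.
    if batch > tx:
        problems.append(f"batchSize {batch} exceeds transactionCapacity {tx}")

    # Assumption: the memory channel rejects transactionCapacity > capacity.
    if tx > capacity:
        problems.append(f"transactionCapacity {tx} exceeds capacity {capacity}")

    return problems


problems = check_config(config)
for problem in problems:
    print(problem)
```

With the values as posted, only the port is flagged; the batch and transaction sizes are consistent with each other under these assumptions.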
However, if I send messages above a certain rate, I get the following exception:
2014-11-21 05:48:07,035 (netcat-handler-0) [WARN - org.apache.flume.source.NetcatSource$NetcatSocketHandler.processEvents(NetcatSource.java:407)] Error processing event. Exception follows.
org.apache.flume.ChannelException: Unable to put event on required channel: org.apache.flume.channel.MemoryChannel{name: ch1}
at org.apache.flume.channel.ChannelProcessor.processEvent(ChannelProcessor.java:275)
at org.apache.flume.source.NetcatSource$NetcatSocketHandler.processEvents(NetcatSource.java:394)
at org.apache.flume.source.NetcatSource$NetcatSocketHandler.run(NetcatSource.java:321)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.flume.ChannelException: Cannot commit transaction. Heap space limit of 3456106reached. Please increase heap space allocated to the channel as the sinks may not be keeping up with the sources
at org.apache.flume.channel.MemoryChannel$MemoryTransaction.doCommit(MemoryChannel.java:123)
at org.apache.flume.channel.BasicTransactionSemantics.commit(BasicTransactionSemantics.java:151)
at org.apache.flume.channel.ChannelProcessor.processEvent(ChannelProcessor.java:267)
... 7 more
I was unable to change this heap space value through my conf/flume-env.sh:
JAVA_OPTS="-Xms256m -Xmx512m -Dcom.sun.management.jmxremote"
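The heap space size in the exception is presumably expressed in bytes, which would mean I have a heap space of 3.3 MB. That is very low, and I don't understand where this value comes from. How can I solve this? Thanks a lot in advance!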
Answer 0 (score: 2)
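There are a few knobs you can adjust to get this running properly:
- Increase byteCapacity: a1.channels.ch1.byteCapacity = 6912212
- Increase the memory as suggested (JAVA_OPTS="-Xms512m -Xmx1024m -Dcom.sun.management.jmxremote"). This is probably the best option, because the default byteCapacity is 80% of the process's maximum memory, which already takes up a large share of the process memory.
- Lower byteCapacityBufferPercentage, which reduces the space reserved for headers.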
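To see how these settings could produce the number in the exception, here is a rough arithmetic sketch. It rests on assumptions that do not come from the post: that byteCapacity is given in bytes, that byteCapacityBufferPercentage defaults to 20, and that the memory channel internally counts capacity in slots of 100 bytes. The function names are made up for illustration, and integer arithmetic is used here, so rounding may differ slightly from what Flume does.

```python
# ASSUMPTIONS (based on a reading of Flume 1.x MemoryChannel, not on the post):
#   - byteCapacity is configured in bytes
#   - byteCapacityBufferPercentage defaults to 20
#   - the channel tracks capacity in slots of 100 bytes
BUFFER_PERCENT_DEFAULT = 20
SLOT_SIZE_BYTES = 100


def default_byte_capacity(max_heap_bytes):
    """Default byteCapacity: 80% of the JVM's maximum memory."""
    return max_heap_bytes * 80 // 100


def effective_slots(byte_capacity, buffer_percent=BUFFER_PERCENT_DEFAULT):
    """Limit the channel would enforce, in 100-byte slots (assumed model)."""
    usable_bytes = byte_capacity * (100 - buffer_percent) // 100
    return usable_bytes // SLOT_SIZE_BYTES


# The byteCapacity that appears (commented out) in the question's config.
print(effective_slots(432013312))  # 3456106

# Defaults derived from the two heap sizes mentioned in the thread.
print(effective_slots(default_byte_capacity(512 * 1024 * 1024)))
print(effective_slots(default_byte_capacity(1024 * 1024 * 1024)))

# Lowering the buffer percentage leaves more room for event bodies.
print(effective_slots(432013312, buffer_percent=10))
```

Under these assumptions, the 412 MB byteCapacity from the question maps to exactly 3456106, the figure in the exception, which suggests that the message reports a slot count (roughly 345 MB of event bodies) rather than a 3.3 MB heap. By the same arithmetic, a byteCapacity of 6912212 would correspond to only about 55297 slots, so it seems worth double-checking the unit before applying that value.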