Loading a file into HDFS using Flume

Time: 2013-07-23 11:30:22

Tags: flume

I want to load a text file from my system into HDFS.

This is my conf file:

agent.sources = seqGenSrc
agent.sinks = loggerSink
agent.channels = memoryChannel

agent.sources.seqGenSrc.type = exec
agent.sources.seqGenSrc.command = tail -F my.system.IP/D:/salespeople.txt

agent.sinks.loggerSink.type = hdfs
agent.sinks.loggerSink.hdfs.path = hdfs://IP.address:port:user/flume
agent.sinks.loggerSink.hdfs.filePrefix = events-
agent.sinks.loggerSink.hdfs.round = true
agent.sinks.loggerSink.hdfs.roundValue = 10
agent.sinks.loggerSink.hdfs.roundUnit = minute

agent.channels.memoryChannel.type = memory
agent.channels.memoryChannel.capacity = 1000
agent.channels.memoryChannel.transactionCapacity = 100

agent.sources.seqGenSrc.channels = memoryChannel

agent.sinks.loggerSink.channel = memoryChannel

When I run it, I get the following output, and then it hangs.

13/07/23 16:30:44 INFO nodemanager.DefaultLogicalNodeManager: Starting Channel memoryChannel
13/07/23 16:30:44 INFO nodemanager.DefaultLogicalNodeManager: Waiting for channel: memoryChannel to start. Sleeping for 500 ms
13/07/23 16:30:44 INFO nodemanager.DefaultLogicalNodeManager: Starting Sink loggerSink
13/07/23 16:30:44 INFO nodemanager.DefaultLogicalNodeManager: Starting Source seqGenSrc
13/07/23 16:30:44 INFO source.ExecSource: Exec source starting with command:tail -F 10.48.226.27/D:/salespeople.txt

What am I doing wrong, or what could the error be?

1 answer:

Answer 0 (score: 0)

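I assume you want to write the file to /user/flume. In that case the port must be followed by a slash rather than a colon, so your path should be: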
agent.sinks.loggerSink.hdfs.path = hdfs://IP.address:port/user/flume
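
To see why the stray colon matters, here is a minimal Python sketch that splits both forms of the URI with the standard library's urllib.parse. It only illustrates the URI structure (scheme, host:port, path). It is not how Flume or Hadoop parse the setting, and the host namenode.example and port 8020 are hypothetical stand-ins for IP.address:port. The helper name check_hdfs_path is likewise made up for this example.

```python
from urllib.parse import urlsplit


def check_hdfs_path(uri: str) -> str:
    """Return the path component of an hdfs:// URI, or raise ValueError."""
    parts = urlsplit(uri)
    if parts.scheme != "hdfs":
        raise ValueError(f"expected an hdfs:// URI, got scheme {parts.scheme!r}")
    # Reading .port raises ValueError when the text after the host's first
    # colon is not a plain integer, which is what a second colon causes.
    _ = parts.port
    if not parts.path.startswith("/"):
        raise ValueError("missing absolute path after host:port")
    return parts.path


# Hypothetical host and port, standing in for IP.address:port.
good = "hdfs://namenode.example:8020/user/flume"
bad = "hdfs://namenode.example:8020:user/flume"

print(check_hdfs_path(good))

try:
    check_hdfs_path(bad)
except ValueError as err:
    print("rejected:", err)
```

With the colon form, everything up to the first slash is treated as host and port, so "user" ends up in the port position and only /flume is left as the path.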

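Since your agent uses tail -F, there is no message telling you that it has finished (because it never will ^^). What looks like a hang is the agent waiting for new lines. If you want to know whether your file was created, you have to look in the /user/flume folder.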

I am using a configuration like yours and it works perfectly. Could you try running with
-Dflume.root.logger=INFO,console to get more information?