I set up Flume with two nodes. I want to load data from slave01 into HDFS.
slave01: example-conf.properties
agent.sources = baksrc
agent.channels = memoryChannel
agent.sinks = avro-forward-sink
agent.sources.baksrc.type = exec
agent.sources.baksrc.command = tail -F /root/hadoop/test/data.txt
agent.sources.baksrc.checkperiodic = 1000
agent.sources.baksrc.channels = memoryChannel
agent.channels.memoryChannel.type = memory
agent.channels.memoryChannel.keep-alive = 30
agent.channels.memoryChannel.capacity = 10000
agent.channels.memoryChannel.transactionCapacity = 10000
agent.sinks.avro-forward-sink.type = avro
agent.sinks.avro-forward-sink.hostname = master
agent.sinks.avro-forward-sink.port = 23004
agent.sinks.avro-forward-sink.channel = memoryChannel
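Before starting the agents, it may be worth confirming that slave01 can reach master's avro port once the master agent is up. A quick sketch of that check with netcat, assuming nc is installed on slave01:

# run on slave01 after the master agent has started; -z only tests the connection
nc -z master 23004 && echo "avro port reachable"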
master: example-conf.properties
agent.sources = avrosrc
agent.sinks = hdfs-write
agent.channels = memoryChannel
agent.sources.avrosrc.type = avro
agent.sources.avrosrc.bind = master
agent.sources.avrosrc.port = 23004
agent.sources.avrosrc.channels = memoryChannel
agent.channels.memoryChannel.type = memory
agent.channels.memoryChannel.keep-alive = 30
agent.channels.memoryChannel.capacity = 10000
agent.channels.memoryChannel.transactionCapacity = 10000
agent.sinks.hdfs-write.type = hdfs
agent.sinks.hdfs-write.hdfs.path = hdfs://172.16.86.38:9000/flume/webdata
agent.sinks.hdfs-write.hdfs.rollInterval = 0
agent.sinks.hdfs-write.hdfs.rollSize = 4000000
agent.sinks.hdfs-write.hdfs.rollCount = 0
agent.sinks.hdfs-write.hdfs.writeFormat = Text
agent.sinks.hdfs-write.hdfs.fileType = DataStream
agent.sinks.hdfs-write.hdfs.batchSize = 10
agent.sinks.hdfs-write.channel = memoryChannel
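Likewise, on master one can verify that the avro source is actually bound to port 23004 after the agent starts (syntax assumes Linux net-tools):

# run on master once its agent is up
netstat -tlnp | grep 23004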
Then I run a shell script like this:
#!/bin/bash
# brace expansion {1..1000000} needs bash, not plain sh
for i in {1..1000000}; do
    echo "test flume to Hbase $i" >> /root/hadoop/test/data.txt
    sleep 0.1
done
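For reference, a typical way to run this generator and confirm that data is flowing into the tailed file (the script name gen_data.sh is just a placeholder):

chmod +x gen_data.sh
./gen_data.sh &
tail -f /root/hadoop/test/data.txt    # should print a new line roughly every 0.1 s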
Then I start Flume:
flume-ng agent --conf conf --conf-file example-conf.properties --name agent -Dflume.root.logger=DEBUG,console
I get no errors on the console:
14/05/06 16:38:44 INFO source.AvroSource: Avro source avrosrc stopping: Avro source avrosrc: { bindAddress: master, port: 23004 }
14/05/06 16:38:44 INFO ipc.NettyServer: [id: 0x49f2de1b, /172.16.86.39:9359 :> /172.16.86.38:23004] DISCONNECTED
14/05/06 16:38:44 INFO ipc.NettyServer: [id: 0x49f2de1b, /172.16.86.39:9359 :> /172.16.86.38:23004] UNBOUND
14/05/06 16:38:44 INFO ipc.NettyServer: [id: 0x49f2de1b, /172.16.86.39:9359 :> /172.16.86.38:23004] CLOSED
But I cannot see the file in HDFS. Is something wrong with my configuration? I have already tested the setup on master alone, and it works fine.
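A standard way to look for the output directly is the HDFS CLI. Note that with rollInterval and rollCount both set to 0, the sink only rolls a file after rollSize = 4000000 bytes, so a short run may leave only an open FlumeData*.tmp file:

hdfs dfs -ls /flume/webdata
# or, fully qualified:
hdfs dfs -ls hdfs://172.16.86.38:9000/flume/webdata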
Answer 0 (score: 0)
Which version of Flume are you using?
Have you set HADOOP_HOME?
Does Flume's startup output show a classpath that includes the Hadoop jars from HADOOP_HOME?
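One quick way to check that last point: the flume-ng launcher reports the Hadoop libraries it picks up at startup (the exact path depends on your install), so look for output similar to:

# printed by flume-ng at startup when HADOOP_HOME is set correctly:
# Info: Including Hadoop libraries found via (/usr/local/hadoop/bin/hadoop) for HDFS access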
If you are using Apache Flume, then step by step (a sketch of these steps follows below):
1. Set HADOOP_HOME.
2. Edit Hadoop's core-site.xml and make sure the namenode address is correct.
3. Use an HDFS path without the scheme and host: agent.sinks.hdfs-write.hdfs.path = /flume/webdata
4. Start Flume.
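A minimal sketch of those four steps, assuming a Hadoop install at /usr/local/hadoop (adjust paths to your environment):

# 1. point Flume at the Hadoop client libraries
export HADOOP_HOME=/usr/local/hadoop

# 2. in core-site.xml, the namenode address must match the cluster:
#    <property>
#      <name>fs.default.name</name>        <!-- fs.defaultFS on Hadoop 2.x -->
#      <value>hdfs://172.16.86.38:9000</value>
#    </property>

# 3. with core-site.xml readable by Flume, the sink path can drop scheme and host:
#    agent.sinks.hdfs-write.hdfs.path = /flume/webdata

# 4. restart the agents (master first, then slave01)
flume-ng agent --conf conf --conf-file example-conf.properties --name agent -Dflume.root.logger=DEBUG,console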