Error writing event data to HDFS through Flume

Time: 2013-03-17 07:13:00

Tags: hadoop cloudera flume

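I am using the CDH3 update 4 tarball for development, and I have it up and running. I have also downloaded the equivalent Flume tarball from Cloudera, viz. 1.1.0, and I am trying to write the tail of a log file to HDFS using the hdfs-sink. When I run the Flume agent, it starts fine but eventually fails when it tries to write new event data to HDFS. I couldn't find a better group than Stack Overflow to post this question. Here is the Flume configuration I am using: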

agent.sources=exec-source
agent.sinks=hdfs-sink
agent.channels=ch1

agent.sources.exec-source.type=exec
agent.sources.exec-source.command=tail -F /locationoffile

agent.sinks.hdfs-sink.type=hdfs
agent.sinks.hdfs-sink.hdfs.path=hdfs://localhost:8020/flume
agent.sinks.hdfs-sink.hdfs.filePrefix=apacheaccess

agent.channels.ch1.type=memory
agent.channels.ch1.capacity=1000

agent.sources.exec-source.channels=ch1
agent.sinks.hdfs-sink.channel=ch1

Also, here is a snippet of the error that shows up in the console when the agent receives new event data and tries to write it to HDFS:

13/03/16 17:59:21 INFO hdfs.BucketWriter: Creating hdfs://localhost:8020/user/hdfs-user/flume/apacheaccess.1363436060424.tmp
13/03/16 17:59:22 WARN hdfs.HDFSEventSink: HDFS IO error
java.io.IOException: Failed on local exception: java.io.IOException: Broken pipe; Host Details : local host is: "sumit-HP-Pavilion-dv3-Notebook-PC/127.0.0.1"; destination host is: "localhost":8020; 
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:759)
    at org.apache.hadoop.ipc.Client.call(Client.java:1164)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
    at $Proxy9.create(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
    at $Proxy9.create(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:192)
    at org.apache.hadoop.hdfs.DFSOutputStream.<init>(DFSOutputStream.java:1298)
    at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1317)
    at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1215)
    at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1173)
    at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:272)
    at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:261)
    at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:78)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:805)
    at org.apache.hadoop.io.SequenceFile$Writer.<init>(SequenceFile.java:1060)
    at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:270)
    at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:369)
    at org.apache.flume.sink.hdfs.HDFSSequenceFile.open(HDFSSequenceFile.java:65)
    at org.apache.flume.sink.hdfs.HDFSSequenceFile.open(HDFSSequenceFile.java:49)
    at org.apache.flume.sink.hdfs.BucketWriter.doOpen(BucketWriter.java:190)
    at org.apache.flume.sink.hdfs.BucketWriter.access$000(BucketWriter.java:50)
    at org.apache.flume.sink.hdfs.BucketWriter$1.run(BucketWriter.java:157)
    at org.apache.flume.sink.hdfs.BucketWriter$1.run(BucketWriter.java:154)
    at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:127)
    at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:154)
    at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:316)
    at org.apache.flume.sink.hdfs.HDFSEventSink$1.call(HDFSEventSink.java:718)
    at org.apache.flume.sink.hdfs.HDFSEventSink$1.call(HDFSEventSink.java:715)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Broken pipe
    at sun.nio.ch.FileDispatcher.write0(Native Method)
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:100)
    at sun.nio.ch.IOUtil.write(IOUtil.java:71)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
    at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:62)
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:143)
    at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:153)
    at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:114)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
    at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
    at java.io.DataOutputStream.flush(DataOutputStream.java:106)
    at org.apache.hadoop.ipc.Client$Connection.sendParam(Client.java:861)
    at org.apache.hadoop.ipc.Client.call(Client.java:1141)
    ... 37 more
13/03/16 17:59:27 INFO hdfs.BucketWriter: Creating hdfs://localhost:8020/user/hdfs-user/flume/apacheaccess.1363436060425.tmp
13/03/16 17:59:27 WARN hdfs.HDFSEventSink: HDFS IO error
java.io.IOException: Failed on local exception: java.io.IOException: Broken pipe; Host Details : local host is: "sumit-HP-Pavilion-dv3-Notebook-PC/127.0.0.1"; destination host is: "localhost":8020; 

1 answer:

Answer 0 (score: 0)

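As people on the Cloudera mailing list suggest, there are two possible causes for this error:

  1. HDFS safe mode is turned on. Try running hadoop dfsadmin -safemode leave and see whether the error goes away.
  2. The Flume and Hadoop versions do not match. To check this, replace the hadoop-core.jar in the flume/lib directory with the hadoop-core.jar from your Hadoop installation folder.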
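
Before swapping jars for the second cause, it may help to confirm that the two copies really differ. The following is only a minimal sketch, not part of the original answer: the helper names `jar_digests` and `same_hadoop_core` are hypothetical, and the glob pattern `hadoop-core*.jar` is an assumption about how the jar is named in your Flume and Hadoop directories. It compares file contents rather than file names, since the two jars may be named differently.

```python
import hashlib
from pathlib import Path


def jar_digests(directory, pattern="hadoop-core*.jar"):
    """Map each matching jar's file name to the SHA-256 of its contents.

    The default pattern is an assumption; adjust it to your jar names.
    """
    digests = {}
    for jar in sorted(Path(directory).glob(pattern)):
        digests[jar.name] = hashlib.sha256(jar.read_bytes()).hexdigest()
    return digests


def same_hadoop_core(flume_lib, hadoop_home):
    """Return True only if both directories contain matching jars
    and the sets of their content digests are identical."""
    flume = set(jar_digests(flume_lib).values())
    hadoop = set(jar_digests(hadoop_home).values())
    # An empty set means no jar was found, which is treated as a mismatch.
    return bool(flume) and flume == hadoop
```

Calling `same_hadoop_core` with your own flume/lib directory and Hadoop installation folder returns False when the jars differ or when no jar is found in one of them, which would point toward the version mismatch described above.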