Unable to load files into Hadoop using Flume

Posted: 2015-05-20 11:01:21

Tags: hadoop, flume

I am using Flume to move files into HDFS. While moving a file, it throws the error below. Please help me resolve this issue.

15/05/20 15:49:26 INFO instrumentation.MonitoredCounterGroup: Component type: SOURCE, name: r1 started
15/05/20 15:49:26 INFO avro.ReliableSpoolingFileEventReader: Preparing to move file /home/crayondata.com/shanmugapriya/apache-flume-1.5.2-bin/staging/HypeVisitorTest.java to /home/crayondata.com/shanmugapriya/apache-flume-1.5.2-bin/staging/HypeVisitorTest.java.COMPLETED
15/05/20 15:49:26 INFO source.SpoolDirectorySource: Spooling Directory Source runner has shutdown.
15/05/20 15:49:26 INFO hdfs.HDFSDataStream: Serializer = TEXT, UseRawLocalFileSystem = false
15/05/20 15:49:26 INFO hdfs.BucketWriter: Creating hdfs://localhost:9000/sha/HypeVisitorTest.java.1432117166377.tmp
15/05/20 15:49:26 ERROR hdfs.HDFSEventSink: process failed
java.lang.UnsupportedOperationException: Not implemented by the DistributedFileSystem FileSystem implementation
    at org.apache.hadoop.fs.FileSystem.getScheme(FileSystem.java:216)
    at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2564)
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2574)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
    at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:270)
    at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:262)
    at org.apache.flume.sink.hdfs.BucketWriter$9$1.run(BucketWriter.java:718)
    at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:183)
    at org.apache.flume.sink.hdfs.BucketWriter.access$1700(BucketWriter.java:59)
    at org.apache.flume.sink.hdfs.BucketWriter$9.call(BucketWriter.java:715)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
15/05/20 15:49:26 ERROR flume.SinkRunner: Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: java.lang.UnsupportedOperationException: Not implemented by the DistributedFileSystem FileSystem implementation
    at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:471)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.UnsupportedOperationException: Not implemented by the DistributedFileSystem FileSystem implementation
    at org.apache.hadoop.fs.FileSystem.getScheme(FileSystem.java:216)
    at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2564)
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2574)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
    at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:270)
    at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:262)
    at org.apache.flume.sink.hdfs.BucketWriter$9$1.run(BucketWriter.java:718)
    at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:183)
    at org.apache.flume.sink.hdfs.BucketWriter.access$1700(BucketWriter.java:59)
    at org.apache.flume.sink.hdfs.BucketWriter$9.call(BucketWriter.java:715)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    ... 1 more
15/05/20 15:49:26 INFO source.SpoolDirectorySource: Spooling Directory Source runner has shutdown.

Here is my flumeconf.conf file:

# example.conf: A single-node Flume configuration

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /home/shanmugapriya/apache-flume-1.5.2-bin/staging
a1.sources.r1.fileHeader = true
a1.sources.r1.maxBackoff = 10000
a1.sources.r1.basenameHeader = true

# Describe the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://localhost:9000/sha
a1.sinks.k1.hdfs.writeFormat = Text
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.rollInterval = 0
a1.sinks.k1.hdfs.rollSize = 0
a1.sinks.k1.hdfs.rollCount = 0
a1.sinks.k1.hdfs.idleTimeout = 100
a1.sinks.k1.hdfs.filePrefix = %{basename}

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 100000
a1.channels.c1.transactionCapacity = 1000
a1.channels.c1.byteCapacity = 0

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
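
For reference, a command like the following starts an agent with this configuration (the file name and relative paths are assumptions based on a standard Flume tarball layout):

# Run from the Flume install directory, with this file saved as conf/flumeconf.conf
bin/flume-ng agent --conf conf --conf-file conf/flumeconf.conf --name a1 -Dflume.root.logger=INFO,console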

Please help me solve this problem. Thanks in advance.

1 Answer:

Answer 0 (score: 0)

@Shan, please confirm that you have the relevant Hadoop HDFS jars on Apache Flume's classpath.
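
A quick way to check is sketched below (the $FLUME_HOME path and the flume-env.sh location are assumptions for a typical tarball install). Note that this exact UnsupportedOperationException thrown from FileSystem.getScheme() is commonly a symptom of mixed Hadoop versions on the classpath, e.g. an old hadoop-core 1.x jar sitting alongside Hadoop 2.x's hadoop-common:

# List any Hadoop jars bundled with Flume; remove or replace any whose
# version does not match the cluster's Hadoop version.
ls $FLUME_HOME/lib | grep -i hadoop

# Expose the cluster's own client jars to the agent by adding this line
# to $FLUME_HOME/conf/flume-env.sh (flume-ng picks up FLUME_CLASSPATH):
export FLUME_CLASSPATH="$(hadoop classpath)"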

Also, I see that your HDFS sink uses port 9000, but the default NameNode RPC port is usually 8020. Is 9000 correct for your cluster?
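
You can confirm the NameNode URI from the cluster side; two ways to check (assuming a standard Hadoop 2.x client install):

# Print the configured default filesystem; the sink's hdfs.path should
# use the same host and port.
hdfs getconf -confKey fs.defaultFS

# Or read it from core-site.xml directly (path assumes a standard layout).
grep -A1 fs.defaultFS $HADOOP_HOME/etc/hadoop/core-site.xml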