Dumping data into S3 using Flume

Date: 2016-05-18 15:12:07

Tags: hadoop amazon-web-services amazon-s3 flume

I am trying to dump data into S3 with Flume, and the agent fails with:

    org.apache.flume.EventDeliveryException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3native.NativeS3FileSystem not found

I have already added the following to core-site.xml:

    <property>
        <name>fs.s3n.impl</name>
        <value>org.apache.hadoop.fs.s3native.NativeS3FileSystem</value>
        <description>The FileSystem for s3n: (Native S3) uris.</description>
    </property>
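
For reference, the s3n credentials can also be configured in core-site.xml rather than embedded in the sink URL; a minimal sketch using the standard Hadoop s3n credential properties (the values here are placeholders):

    <property>
        <name>fs.s3n.awsAccessKeyId</name>
        <value>YOUR_ACCESS_KEY</value><!-- placeholder -->
    </property>
    <property>
        <name>fs.s3n.awsSecretAccessKey</name>
        <value>YOUR_SECRET_KEY</value><!-- placeholder -->
    </property>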

Output of bin/hadoop classpath:

    /opt/hadoop/hadoop/etc/hadoop:/opt/hadoop/hadoop/share/hadoop/common/lib/*:/opt/hadoop/hadoop/share/hadoop/common/*:/opt/hadoop/hadoop/share/hadoop/hdfs:/opt/hadoop/hadoop/share/hadoop/hdfs/lib/*:/opt/hadoop/hadoop/share/hadoop/hdfs/*:/opt/hadoop/hadoop/share/hadoop/yarn/lib/*:/opt/hadoop/hadoop/share/hadoop/yarn/*:/opt/hadoop/hadoop/share/hadoop/mapreduce/lib/*:/opt/hadoop/hadoop/share/hadoop/mapreduce/*::/opt/hadoop/hadoop/share/hadoop/tools/lib/*:/opt/hadoop/hadoop//contrib/capacity-scheduler/*.jar
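
Note that share/hadoop/tools/lib is already on Hadoop's classpath; in Hadoop 2.7.x, NativeS3FileSystem ships in hadoop-aws-2.7.1.jar (together with its jets3t dependency) under that directory, but nothing puts those jars on the Flume agent's classpath automatically. A minimal sketch of exposing them through conf/flume-env.sh, assuming the /opt/hadoop/hadoop layout shown above:

    # conf/flume-env.sh -- sketch, assuming the Hadoop layout above
    HADOOP_HOME=/opt/hadoop/hadoop
    # flume-ng appends FLUME_CLASSPATH to the agent's classpath
    FLUME_CLASSPATH="$("$HADOOP_HOME"/bin/hadoop classpath):$HADOOP_HOME/share/hadoop/tools/lib/*"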

Hadoop version: 2.7.1
Flume version: 1.6.0

Flume agent configuration:

flume1.sources  =
flume1.channels = kafka-channel-allMsgs kafka-channel-user-match-served-stream
flume1.sinks    = s3-sink-user-match-served-stream s3-sink-allMsgs

flume1.channels.kafka-channel-allMsgs.type = org.apache.flume.channel.kafka.KafkaChannel
flume1.channels.kafka-channel-allMsgs.brokerList = 10.0.1.175:9092,10.0.1.229:9092
flume1.channels.kafka-channel-allMsgs.zookeeperConnect = 10.0.1.60:2181
flume1.channels.kafka-channel-allMsgs.topic = allMsgs
flume1.channels.kafka-channel-allMsgs.groupId = s3_flume_events
flume1.channels.kafka-channel-allMsgs.readSmallestOffset = false
flume1.channels.kafka-channel-allMsgs.parseAsFlumeEvent = false

flume1.channels.kafka-channel-user-match-served-stream.type = org.apache.flume.channel.kafka.KafkaChannel
flume1.channels.kafka-channel-user-match-served-stream.brokerList = 10.0.1.175:9092,10.0.1.229:9092
flume1.channels.kafka-channel-user-match-served-stream.zookeeperConnect = 10.0.1.60:2181
flume1.channels.kafka-channel-user-match-served-stream.topic = user_match_served_stream
flume1.channels.kafka-channel-user-match-served-stream.groupId = s3_flume_matched_served_stream
flume1.channels.kafka-channel-user-match-served-stream.readSmallestOffset = false
flume1.channels.kafka-channel-user-match-served-stream.parseAsFlumeEvent = false
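
Since parseAsFlumeEvent = false, both channels read raw (non-Avro-wrapped) messages from Kafka, so the pipeline can be smoke-tested with Kafka's console producer; a sketch, assuming a stock Kafka install at $KAFKA_HOME:

    # sketch: push a test line into the allMsgs topic
    $KAFKA_HOME/bin/kafka-console-producer.sh \
        --broker-list 10.0.1.175:9092,10.0.1.229:9092 \
        --topic allMsgs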

flume1.sinks.s3-sink-user-match-served-stream.channel = kafka-channel-user-match-served-stream
flume1.sinks.s3-sink-user-match-served-stream.type = hdfs
flume1.sinks.s3-sink-user-match-served-stream.hdfs.filePrefix = user_match
flume1.sinks.s3-sink-user-match-served-stream.hdfs.useLocalTimeStamp = true
flume1.sinks.s3-sink-user-match-served-stream.hdfs.path = s3n://<aws>:<aws>@bucket/served_message/%y-%m/%y-%m-%d
flume1.sinks.s3-sink-user-match-served-stream.hdfs.batchSize = 1024
flume1.sinks.s3-sink-user-match-served-stream.hdfs.rollCount = 1270000
flume1.sinks.s3-sink-user-match-served-stream.hdfs.rollInterval = 1800
flume1.sinks.s3-sink-user-match-served-stream.hdfs.rollSize = 133169152
flume1.sinks.s3-sink-user-match-served-stream.hdfs.codeC = bzip2
flume1.sinks.s3-sink-user-match-served-stream.hdfs.fileType = SequenceFile

flume1.sinks.s3-sink-allMsgs.channel = kafka-channel-allMsgs
flume1.sinks.s3-sink-allMsgs.type = hdfs
flume1.sinks.s3-sink-allMsgs.hdfs.filePrefix = allMsgs
flume1.sinks.s3-sink-allMsgs.hdfs.useLocalTimeStamp = true
flume1.sinks.s3-sink-allMsgs.hdfs.path = s3n://<aws>:<aws>@bucket/all_Msgs/%y-%m/%y-%m-%d
flume1.sinks.s3-sink-allMsgs.hdfs.batchSize = 1024
flume1.sinks.s3-sink-allMsgs.hdfs.rollCount = 1270000
flume1.sinks.s3-sink-allMsgs.hdfs.rollInterval = 1800
flume1.sinks.s3-sink-allMsgs.hdfs.rollSize = 133169152
flume1.sinks.s3-sink-allMsgs.hdfs.codeC = bzip2
flume1.sinks.s3-sink-allMsgs.hdfs.fileType = SequenceFile
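
For completeness, a sketch of how this agent would be launched (the file name flume-s3.conf is an assumption; --name must match the flume1 prefix used above):

    # sketch: start the agent with console logging for debugging
    bin/flume-ng agent \
        --conf conf \
        --conf-file conf/flume-s3.conf \
        --name flume1 \
        -Dflume.root.logger=INFO,console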

0 Answers:

No answers yet.