I am getting an error while loading data from Twitter into HDFS.
I am using the Hortonworks sandbox with Ambari, hadoop-2.7.
This is my flume.conf file:
TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sinks = HDFS
TwitterAgent.sources.Twitter.type = com.cloudera.flume.source.TwitterSource
TwitterAgent.sources.Twitter.channels = MemChannel
TwitterAgent.sources.Twitter.consumerKey =oblBU8btK3OpuSoFce8fJTOz9
TwitterAgent.sources.Twitter.consumerSecret =ofsGWmx1T4GHvi8qDcAySUAC3mVdvSS8VcfD9CPTejxzQ52izk
TwitterAgent.sources.Twitter.accessToken =3479003538-2OP1N7wKqSkAohXscehBdhbMfJhoXqSPkng7cPY
TwitterAgent.sources.Twitter.accessTokenSecret =0vrKLzdUplRnPjcTWiSNKhu9Ohe18FcoOXYMmD7OUazTt
TwitterAgent.sources.Twitter.keywords = hadoop, big data, analytics
TwitterAgent.sinks.HDFS.channel = MemChannel
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.hdfs.path =/flume/tweets
TwitterAgent.sinks.HDFS.hdfs.fileType =DataStream
TwitterAgent.sinks.HDFS.hdfs.filePrefix =twitter
TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
TwitterAgent.sinks.HDFS.hdfs.batchSize = 1000
TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
TwitterAgent.sinks.HDFS.hdfs.rollCount = 10000
TwitterAgent.sinks.HDFS.hdfs.rollInterval = 10
TwitterAgent.channels.MemChannel.type = memory
TwitterAgent.channels.MemChannel.capacity = 10000
TwitterAgent.channels.MemChannel.transactionCapacity = 100
15/09/11 07:21:03 INFO twitter4j.TwitterStreamImpl: Establishing connection.
15/09/11 07:21:03 INFO twitter4j.TwitterStreamImpl: stream.twitter.com
15/09/11 07:21:03 INFO twitter4j.TwitterStreamImpl: Waiting for 1000 milliseconds
15/09/11 07:21:04 INFO twitter4j.TwitterStreamImpl: Establishing connection.
15/09/11 07:21:04 INFO twitter4j.TwitterStreamImpl: stream.twitter.com
15/09/11 07:21:04 INFO twitter4j.TwitterStreamImpl: Waiting for 2000 milliseconds
15/09/11 07:21:06 INFO twitter4j.TwitterStreamImpl: Establishing connection.
15/09/11 07:21:06 INFO twitter4j.TwitterStreamImpl: stream.twitter.com
15/09/11 07:21:06 INFO twitter4j.TwitterStreamImpl: Waiting for 4000 milliseconds
15/09/11 07:21:10 INFO twitter4j.TwitterStreamImpl: Establishing connection.
15/09/11 07:21:10 INFO twitter4j.TwitterStreamImpl: stream.twitter.com
15/09/11 07:21:10 INFO twitter4j.TwitterStreamImpl: Waiting for 8000 milliseconds
15/09/11 07:21:18 INFO twitter4j.TwitterStreamImpl: Establishing connection.
15/09/11 07:21:18 INFO twitter4j.TwitterStreamImpl: stream.twitter.com
15/09/11 07:21:18 INFO twitter4j.TwitterStreamImpl: Waiting for 16000 milliseconds
15/09/11 07:21:34 INFO twitter4j.TwitterStreamImpl: Establishing connection.
15/09/11 07:21:34 INFO twitter4j.TwitterStreamImpl: stream.twitter.com
15/09/11 07:21:34 INFO twitter4j.TwitterStreamImpl: Waiting for 16000 milliseconds
^C15/09/11 07:21:45 INFO lifecycle.LifecycleSupervisor: Stopping lifecycle supervisor 10
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Component type: SINK, name: HDFS stopped
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SINK, name: HDFS. sink.start.time == 1441956061906
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SINK, name: HDFS. sink.stop.time == 1441956105092
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SINK, name: HDFS. sink.batch.complete == 0
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SINK, name: HDFS. sink.batch.empty == 7
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SINK, name: HDFS. sink.batch.underflow == 0
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SINK, name: HDFS. sink.connection.closed.count == 0
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SINK, name: HDFS. sink.connection.creation.count == 0
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SINK, name: HDFS. sink.connection.failed.count == 0
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SINK, name: HDFS. sink.event.drain.attempt == 0
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SINK, name: HDFS. sink.event.drain.sucess == 0
15/09/11 07:21:45 INFO node.PollingPropertiesFileConfigurationProvider: Configuration provider stopping
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: MemChannel stopped
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: CHANNEL, name: MemChannel. channel.start.time == 1441956061903
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: CHANNEL, name: MemChannel. channel.stop.time == 1441956105094
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: CHANNEL, name: MemChannel. channel.capacity == 10000
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: CHANNEL, name: MemChannel. channel.current.size == 0
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: CHANNEL, name: MemChannel. channel.event.put.attempt == 0
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: CHANNEL, name: MemChannel. channel.event.put.success == 0
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: CHANNEL, name: MemChannel. channel.event.take.attempt == 7
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: CHANNEL, name: MemChannel. channel.event.take.success == 0
[root@sandbox bin]#
Answer 0 (score: 0)
It looks like you have not given the full HDFS path:
TwitterAgent.sinks.HDFS.hdfs.path =hdfs://localhost:8020/flume/tweets
Here localhost is the hostname and 8020 is the HDFS port.
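If the NameNode does not run on localhost, the host and port can usually be read from the fs.defaultFS property, for example with `hdfs getconf -confKey fs.defaultFS`. A minimal sketch of the sink line under that assumption follows; `<namenode-host>` is a placeholder, not a value taken from the question, and the port may also differ on your cluster:

```properties
# Assumption: <namenode-host> is a placeholder. Replace it (and the port, if
# needed) with the host and port reported for fs.defaultFS on your cluster.
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://<namenode-host>:8020/flume/tweets
```

The rest of the sink settings can stay as they are in the question.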
Hope this helps. Let me know if you have any questions.