I currently have this configuration in Flume:
TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sinks = HDFS
TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
TwitterAgent.sources.Twitter.channels = MemChannel
TwitterAgent.sources.Twitter.consumerKey = XXXXXXXXXXXX
TwitterAgent.sources.Twitter.consumerSecret = XXXXXXXXXXXX
TwitterAgent.sources.Twitter.accessToken = XXXXXXXXXXXX
TwitterAgent.sources.Twitter.accessTokenSecret = XXXXXXXXXXXX
TwitterAgent.sources.Twitter.maxBatchSize = 1000
TwitterAgent.sinks.HDFS.channel = MemChannel
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://quickstart.cloudera:8020/user/flume/tweets
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
TwitterAgent.sinks.HDFS.hdfs.batchSize = 10
TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
TwitterAgent.sinks.HDFS.hdfs.rollCount = 10000
TwitterAgent.sinks.HDFS.hdfs.rollInterval = 600
TwitterAgent.channels.MemChannel.type = memory
TwitterAgent.channels.MemChannel.capacity = 10000
TwitterAgent.channels.MemChannel.transactionCapacity = 100
The Twitter app auth keys are correct, yet I keep getting this error in the Flume log file:
18/01/25 08:01:46 INFO conf.FlumeConfiguration: Processing:HDFS
18/01/25 08:01:46 INFO conf.FlumeConfiguration: Processing:HDFS
18/01/25 08:01:46 INFO conf.FlumeConfiguration: Processing:HDFS
18/01/25 08:01:46 INFO conf.FlumeConfiguration: Processing:HDFS
18/01/25 08:01:46 INFO conf.FlumeConfiguration: Added sinks: HDFS Agent: TwitterAgent
18/01/25 08:01:46 INFO conf.FlumeConfiguration: Processing:HDFS
18/01/25 08:01:46 INFO conf.FlumeConfiguration: Processing:HDFS
18/01/25 08:01:46 INFO conf.FlumeConfiguration: Processing:HDFS
18/01/25 08:01:46 INFO conf.FlumeConfiguration: Processing:HDFS
18/01/25 08:01:46 INFO conf.FlumeConfiguration: Processing:HDFS
18/01/25 08:01:46 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [TwitterAgent]
18/01/25 08:01:46 INFO node.AbstractConfigurationProvider: Creating channels
18/01/25 08:01:46 INFO channel.DefaultChannelFactory: Creating instance of channel MemChannel type memory
18/01/25 08:01:46 INFO node.AbstractConfigurationProvider: Created channel MemChannel
18/01/25 08:01:46 INFO source.DefaultSourceFactory: Creating instance of source Twitter, type org.apache.flume.source.twitter.TwitterSource
18/01/25 08:01:46 INFO twitter.TwitterSource: Consumer Key:'XXXXXXXXXXXX'
18/01/25 08:01:46 INFO twitter.TwitterSource: Consumer Secret:'XXXXXXXXXXXX'
18/01/25 08:01:46 INFO twitter.TwitterSource: Access Token:'XXXXXXXXXXXX'
18/01/25 08:01:46 INFO twitter.TwitterSource: Access Token Secret:'XXXXXXXXXXXX'
18/01/25 08:01:46 INFO sink.DefaultSinkFactory: Creating instance of sink: HDFS, type: hdfs
18/01/25 08:01:46 INFO node.AbstractConfigurationProvider: Channel MemChannel connected to [Twitter, HDFS]
18/01/25 08:01:46 INFO node.Application: Starting new configuration:{ sourceRunners:{Twitter=EventDrivenSourceRunner: { source:org.apache.flume.source.twitter.TwitterSource{name:Twitter,state:IDLE} }} sinkRunners:{HDFS=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@58c9defc counterGroup:{ name:null counters:{} } }} channels:{MemChannel=org.apache.flume.channel.MemoryChannel{name: MemChannel}} }
18/01/25 08:01:46 INFO node.Application: Starting Channel MemChannel
18/01/25 08:01:47 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: CHANNEL, name: MemChannel: Successfully registered new MBean.
18/01/25 08:01:47 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: MemChannel started
18/01/25 08:01:47 INFO node.Application: Starting Sink HDFS
18/01/25 08:01:47 INFO node.Application: Starting Source Twitter
18/01/25 08:01:47 INFO twitter.TwitterSource: Starting twitter source org.apache.flume.source.twitter.TwitterSource{name:Twitter,state:IDLE} ...
18/01/25 08:01:47 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: SINK, name: HDFS: Successfully registered new MBean.
18/01/25 08:01:47 INFO instrumentation.MonitoredCounterGroup: Component type: SINK, name: HDFS started
18/01/25 08:01:47 INFO twitter.TwitterSource: Twitter source Twitter started.
18/01/25 08:01:47 INFO twitter4j.TwitterStreamImpl: Establishing connection.
**18/01/25 08:01:49 INFO twitter4j.TwitterStreamImpl: Received fatal alert: access_denied
18/01/25 08:01:49 ERROR twitter.TwitterSource: Exception while streaming tweets
Received fatal alert: access_denied**
Relevant discussions can be found on the Internet at:
http://www.google.co.jp/search?q=d0031b0b or
http://www.google.co.jp/search?q=1db75522
TwitterException{exceptionCode=[d0031b0b-1db75522 db667dea-99334ae4], statusCode=-1, message=null, code=-1, retryAfter=-1, rateLimitStatus=null, version=3.0.3}
at twitter4j.internal.http.HttpClientImpl.request(HttpClientImpl.java:192)
I created the twitter.conf file at this path: /home/cloudera/flumeprac
I ran the following command in the terminal:
[cloudera@quickstart flumeprac]$ flume-ng agent --conf-file twitter.conf --name TwitterAgent --conf $FLUME_HOME/conf -Dflume.root.logger=INFO,console
I have already gone through the following links: https://community.hortonworks.com/questions/58817/flume-twitter-agent-behind-proxy-error.html
https://stackoverflow.com/questions/25699558/issues-with-flume-hdfs-sink-from-twitter