创建FlumeDStream时出现Yarn错误的Spark流java.net.BindException:无法分配请求的地址

时间:2014-12-18 14:41:31

标签: java scala hadoop spark-streaming flume-ng

我正在尝试从基于水槽推送的方法创建火花流。我在我的Yarn集群上运行spark。在启动流时,它无法绑定请求的地址。 我正在使用scala-shell来执行程序,下面是我正在使用的代码

import org.apache.spark.streaming.StreamingContext
import org.apache.spark.streaming.StreamingContext._
import org.apache.spark.streaming.Seconds
import org.apache.spark.streaming.flume._
var ssc = new StreamingContext(sc,Seconds(60))
var stream = FlumeUtils.createStream(ssc,"master.internal", 5858);
stream.print()
stream.count().map(cnt => "Received " + cnt + " flume events." ).print()
ssc.start()
ssc.awaitTermination()

Flume Agent无法写入此端口,因为此代码无法绑定5858端口。

Flume Stack Trace:


 [18-Dec-2014 15:20:13] [WARN] [org.apache.flume.sink.AbstractRpcSink.start(AbstractRpcSink.java:294) 294] Unable to create Rpc client using hostname: hostname, port: 5858
org.apache.flume.FlumeException: NettyAvroRpcClient { host: hadoop-master.nycloudlab.internal, port: 7575 }: RPC connection error
        at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:178)
        at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:118)
        at org.apache.flume.api.NettyAvroRpcClient.configure(NettyAvroRpcClient.java:624)



Caused by: java.io.IOException: Error connecting to /hostname:port
        at org.apache.avro.ipc.NettyTransceiver.getChannel(NettyTransceiver.java:280)
        at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:206)
        at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:155)
        at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:164)
        ... 18 more
Caused by: java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
        at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:396)
        at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:358)
        at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:274)
        ... 3 more

来自火花流的堆栈跟踪如下。

    14/12/18 19:57:48 ERROR scheduler.ReceiverTracker: Deregistered receiver for stream 0: Error starting receiver 0 - org.jboss.netty.channel.ChannelException: Failed to bind to: <server-name>/IP:5858
        at org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272)
        at org.apache.avro.ipc.NettyServer.<init>(NettyServer.java:106)
        at org.apache.spark.streaming.flume.FlumeReceiver.initServer(FlumeInputDStream.scala:157)
        at org.apache.spark.streaming.flume.FlumeReceiver.onStart(FlumeInputDStream.scala:171)
        at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:121)
        at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:106)
        at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverLauncher$$anonfun$9.apply(ReceiverTracker.scala:264)
        at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverLauncher$$anonfun$9.apply(ReceiverTracker.scala:257)
        at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
        at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
        at org.apache.spark.scheduler.Task.run(Task.scala:54)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.BindException: Cannot assign requested ad`enter code here`dress
        at sun.nio.ch.Net.bind(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:126)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
        at org.jboss.netty.channel.socket.nio.NioServerBoss$RegisterTask.run(NioServerBoss.java:193)
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.processTaskQueue(AbstractNioSelector.java:366)
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:290)
        at org.jboss.netty.channel.socket.nio.NioServerBoss.run(NioServerBoss.java:42)
        ... 3 more

1 个答案:

答案 0 :(得分:0)

这个例子(org.apache.spark.examples.streaming.FlumeEventCount)有效:

// Create the context and set the batch size
val sparkConf = new SparkConf().setAppName("FlumeEventCount")
val ssc = new StreamingContext(sparkConf, batchInterval)

// Create a flume stream
val stream = FlumeUtils.createStream(ssc, host, port, StorageLevel.MEMORY_ONLY_SER_2)

// Print out the count of events received from this server in each batch
stream.count().map(cnt => "Received " + cnt + " flume events." ).print()

一些提示:

  • 使用 val 而不是 var
  • 在相关节点中使用完全 ip 而非主机名或修改 / etc / hosts