Passing the hostname to netty

Date: 2015-04-02 11:44:52

Tags: networking apache-spark netty

Background: I have two machines with the same hostname, and I need to set up a local Spark cluster for testing. Setting up a master and a worker works fine, but trying to run an application with the driver causes problems: netty doesn't seem to pick the correct host (no matter what I put there, it just picks the first one).

The identical hostname:

$ dig +short corehost
192.168.0.100
192.168.0.101
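
A quick diagnostic sketch, not from the original post, to see what the JVM itself resolves for the shared hostname; the startup warning in the log further down suggests this resolution is what drives Spark's choice of address:

import java.net.InetAddress

// Prints every address the JVM resolves for the shared hostname, plus the
// single address getLocalHost picks. Netty presumably ends up binding to
// one of these, which is why the wrong machine can be chosen.
object ResolveCheck {
  def main(args: Array[String]): Unit = {
    InetAddress.getAllByName("corehost")
      .foreach(a => println(s"corehost -> ${a.getHostAddress}"))
    println(s"getLocalHost -> ${InetAddress.getLocalHost.getHostAddress}")
  }
}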

Spark config (used by the master and the local worker):

export SPARK_LOCAL_DIRS=/some/dir
export SPARK_LOCAL_IP=corehost       # I tried various values like 192.168.0.x
export SPARK_MASTER_IP=corehost      # for local, master, and the driver
export SPARK_MASTER_PORT=7077
export SPARK_WORKER_CORES=2
export SPARK_WORKER_MEMORY=2g
export SPARK_WORKER_INSTANCES=2
export SPARK_WORKER_DIR=/some/dir
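
Since both machines answer to the same hostname, the only unambiguous value for SPARK_LOCAL_IP is each machine's own IP. A minimal sketch (plain JDK calls, not part of the original post) to list the candidate addresses on a given box:

import java.net.{Inet4Address, NetworkInterface}
import scala.collection.JavaConverters._

// Lists every non-loopback IPv4 address on this machine; the 192.168.0.x
// entry is the one to export as SPARK_LOCAL_IP on that box.
object ListAddrs {
  def main(args: Array[String]): Unit = {
    for {
      nic  <- NetworkInterface.getNetworkInterfaces.asScala
      addr <- nic.getInetAddresses.asScala
      if !addr.isLoopbackAddress && addr.isInstanceOf[Inet4Address]
    } println(s"${nic.getName}: ${addr.getHostAddress}")
  }
}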

Spark starts up and I can see the workers in the web UI. But when I run the Spark "job" below:

val conf = new SparkConf().setAppName("AaA")
                          // tried 192.168.0.x and localhost
                          .setMaster("spark://corehost:7077")
val sc = new SparkContext(conf)

I get this exception:

15/04/02 12:34:04 INFO SparkContext: Running Spark version 1.3.0
15/04/02 12:34:04 WARN Utils: Your hostname, corehost resolves to a loopback address: 127.0.0.1; using 192.168.0.100 instead (on interface en1)
15/04/02 12:34:04 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
15/04/02 12:34:05 ERROR NettyTransport: failed to bind to corehost.home/192.168.0.101:0, shutting down Netty transport
...
Exception in thread "main" java.net.BindException: Failed to bind to: corehost.home/192.168.0.101:0: Service 'sparkDriver' failed after 16 retries!
    at org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272)
    at akka.remote.transport.netty.NettyTransport$$anonfun$listen$1.apply(NettyTransport.scala:393)
    at akka.remote.transport.netty.NettyTransport$$anonfun$listen$1.apply(NettyTransport.scala:389)
    at scala.util.Success$$anonfun$map$1.apply(Try.scala:206)
    at scala.util.Try$.apply(Try.scala:161)
    at scala.util.Success.map(Try.scala:206)
    at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:235)
    at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:235)
    at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
    at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.processBatch$1(BatchingExecutor.scala:67)
    at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:82)
    at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59)
    at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59)
    at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
    at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:58)
    at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
15/04/02 12:34:05 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
15/04/02 12:34:05 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
15/04/02 12:34:05 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.

Process finished with exit code 1

I'm not sure how to proceed... it's a complete jungle of IP addresses. I'm also not sure whether this is a networking issue.

1 answer:

Answer 0 (score: 2)

My experience with the same problem is that it comes down to how things are set up locally. Try being more explicit in your Spark driver code by adding SPARK_LOCAL_IP and the driver host IP to the configuration:

val conf = new SparkConf().setAppName("AaA")
                          .setMaster("spark://localhost:7077")
                          .set("spark.local.ip","192.168.1.100")
                          .set("spark.driver.host","192.168.1.100")

This should tell netty which of the two identical hosts to use.
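
For completeness, a minimal self-contained driver in the spirit of the answer above, with every address pinned explicitly. The 192.168.0.100 address is an assumption taken from the dig output earlier; substitute the IP of the machine the driver actually runs on:

import org.apache.spark.{SparkConf, SparkContext}

object BindTest {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("BindTest")
      // use an IP, not the ambiguous hostname (192.168.0.100 assumed here)
      .setMaster("spark://192.168.0.100:7077")
      .set("spark.local.ip", "192.168.0.100")    // address netty binds to
      .set("spark.driver.host", "192.168.0.100") // address executors reach the driver on
    val sc = new SparkContext(conf)
    println(sc.parallelize(1 to 10).count()) // trivial job to confirm the cluster works
    sc.stop()
  }
}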