I'm trying to learn Spark by running it in standalone mode on my MacBook Pro (OS X 10.9.2). I downloaded and built it using the instructions here:
http://spark.apache.org/docs/latest/building-spark.html
I then started the master using the instructions here:
https://spark.apache.org/docs/latest/spark-standalone.html#starting-a-cluster-manually
That worked, although I had to add the following line to my SPARK_HOME/conf/spark-env.sh file to get it to start successfully:
export SPARK_MASTER_IP="127.0.0.1"
However, when I try to run this command to start a worker:
./bin/spark-class org.apache.spark.deploy.worker.Worker spark://127.0.0.1:7077
it fails with this error:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/01/26 16:29:17 INFO Worker: Registered signal handlers for [TERM, HUP, INT]
15/01/26 16:29:17 INFO SecurityManager: Changing view acls to: erioconnor
15/01/26 16:29:17 INFO SecurityManager: Changing modify acls to: erioconnor
15/01/26 16:29:17 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(erioconnor); users with modify permissions: Set(erioconnor)
15/01/26 16:29:17 INFO Slf4jLogger: Slf4jLogger started
15/01/26 16:29:17 INFO Remoting: Starting remoting
15/01/26 16:29:17 ERROR NettyTransport: failed to bind to /10.252.181.130:0, shutting down Netty transport
15/01/26 16:29:17 ERROR Remoting: Remoting error: [Startup failed] [
akka.remote.RemoteTransportException: Startup failed
at akka.remote.Remoting.akka$remote$Remoting$$notifyError(Remoting.scala:136)
at akka.remote.Remoting.start(Remoting.scala:201)
at akka.remote.RemoteActorRefProvider.init(RemoteActorRefProvider.scala:184)
at akka.actor.ActorSystemImpl.liftedTree2$1(ActorSystem.scala:618)
at akka.actor.ActorSystemImpl._start$lzycompute(ActorSystem.scala:615)
at akka.actor.ActorSystemImpl._start(ActorSystem.scala:615)
at akka.actor.ActorSystemImpl.start(ActorSystem.scala:632)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:141)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:118)
at org.apache.spark.util.AkkaUtils$.org$apache$spark$util$AkkaUtils$$doCreateActorSystem(AkkaUtils.scala:121)
at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:54)
at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:53)
at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1676)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1667)
at org.apache.spark.util.AkkaUtils$.createActorSystem(AkkaUtils.scala:56)
at org.apache.spark.deploy.worker.Worker$.startSystemAndActor(Worker.scala:495)
at org.apache.spark.deploy.worker.Worker$.main(Worker.scala:475)
at org.apache.spark.deploy.worker.Worker.main(Worker.scala)
Caused by: org.jboss.netty.channel.ChannelException: Failed to bind to: /10.252.181.130:0
at org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272)
at akka.remote.transport.netty.NettyTransport$$anonfun$listen$1.apply(NettyTransport.scala:393)
at akka.remote.transport.netty.NettyTransport$$anonfun$listen$1.apply(NettyTransport.scala:389)
at scala.util.Success$$anonfun$map$1.apply(Try.scala:206)
at scala.util.Try$.apply(Try.scala:161)
at scala.util.Success.map(Try.scala:206)
at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:235)
at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:235)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.processBatch$1(BatchingExecutor.scala:67)
at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:82)
at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59)
at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59)
at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:58)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.net.BindException: Can't assign requested address
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:444)
at sun.nio.ch.Net.bind(Net.java:436)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at org.jboss.netty.channel.socket.nio.NioServerBoss$RegisterTask.run(NioServerBoss.java:193)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.processTaskQueue(AbstractNioSelector.java:372)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:296)
at org.jboss.netty.channel.socket.nio.NioServerBoss.run(NioServerBoss.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
]
15/01/26 16:29:17 WARN Utils: Service 'sparkWorker' could not bind on port 0. Attempting port 1.
15/01/26 16:29:17 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
15/01/26 16:29:17 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
15/01/26 16:29:17 INFO Remoting: Remoting shut down
15/01/26 16:29:17 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
The error repeats a total of 16 times before the process gives up.
I don't understand the reference to the IP address 10.252.181.130. It doesn't appear anywhere in the Spark code or configuration that I've been able to find, and googling it turns up nothing. I did notice this line in the log file when starting the master:
15/01/26 16:27:18 INFO MasterWebUI: Started MasterWebUI at http://10.252.181.130:8080
I looked at the Scala source for MasterWebUI (and the WebUI class it extends) and noticed that it appears to get that IP address from an environment variable named SPARK_PUBLIC_DNS. I tried setting that variable to 127.0.0.1 in my spark-env.sh script (sketched after the log below), but that didn't work. In fact, it prevented my master from starting at all:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/01/26 19:46:16 INFO Master: Registered signal handlers for [TERM, HUP, INT]
Exception in thread "main" java.net.UnknownHostException: LM-PDX-00871419: LM-PDX-00871419: nodename nor servname provided, or not known
at java.net.InetAddress.getLocalHost(InetAddress.java:1473)
at org.apache.spark.util.Utils$.findLocalIpAddress(Utils.scala:620)
at org.apache.spark.util.Utils$.localIpAddress$lzycompute(Utils.scala:612)
at org.apache.spark.util.Utils$.localIpAddress(Utils.scala:612)
at org.apache.spark.util.Utils$.localIpAddressHostname$lzycompute(Utils.scala:613)
at org.apache.spark.util.Utils$.localIpAddressHostname(Utils.scala:613)
at org.apache.spark.util.Utils$$anonfun$localHostName$1.apply(Utils.scala:665)
at org.apache.spark.util.Utils$$anonfun$localHostName$1.apply(Utils.scala:665)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.util.Utils$.localHostName(Utils.scala:665)
at org.apache.spark.deploy.master.MasterArguments.<init>(MasterArguments.scala:27)
at org.apache.spark.deploy.master.Master$.main(Master.scala:819)
at org.apache.spark.deploy.master.Master.main(Master.scala)
Caused by: java.net.UnknownHostException: LM-PDX-00871419: nodename nor servname provided, or not known
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:901)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1293)
at java.net.InetAddress.getLocalHost(InetAddress.java:1469)
... 12 more
(Note that LM-PDX-00871419 is my Mac's hostname.)
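For reference, here is a minimal sketch of the spark-env.sh change described above (only these two lines; the second is the addition that broke master startup):
export SPARK_MASTER_IP="127.0.0.1"
export SPARK_PUBLIC_DNS="127.0.0.1"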
So at this point I'm a bit stumped. I'd greatly appreciate any suggestions on what to try next.
Answer 0 (score: 5)
If you're using IntelliJ IDEA, it may help to know that you can set the environment variable SPARK_LOCAL_IP to 127.0.0.1 in your Run/Debug configuration.
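If you're not launching from IntelliJ, a rough equivalent (a sketch, assuming you start the worker from a shell) is to export the same variable before running the worker command, or to add it to SPARK_HOME/conf/spark-env.sh:
export SPARK_LOCAL_IP="127.0.0.1"
./bin/spark-class org.apache.spark.deploy.worker.Worker spark://127.0.0.1:7077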
Answer 1 (score: 3)
I've had problems using the loopback address 127.0.0.1 or localhost. I've had more success with the machine's actual public IP address, e.g. 192.168.1.101. I don't know where 10.252.181.130 is coming from either.
For the Mac's hostname, try appending ".local" to the end. If that doesn't work, add an entry to /etc/hosts.
Finally, I'd recommend using the sbin/start-master.sh and sbin/start-slave.sh scripts to run these services.
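As a rough sketch of those last two suggestions (the hostname is taken from the question; treat the exact entries and script arguments as assumptions to adapt for your machine and Spark version):
# /etc/hosts — map the Mac's hostname to a resolvable address
127.0.0.1   localhost LM-PDX-00871419 LM-PDX-00871419.local
# start the master and a worker with the bundled scripts
sbin/start-master.sh
sbin/start-slave.sh spark://LM-PDX-00871419:7077   # some Spark versions expect a worker number before the master URL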